[Document] SPECIFICATION

[Title of the Invention] METHOD OF DESCRIBING OBJECT REGION DATA,
METHOD OF GENERATING OBJECT REGION DATA,
VIDEO DATA PROCESSING METHOD,
VIDEO DATA GENERATING APPARATUS
AND DATA PROCESSING APPARATUS

[What is claimed is:] 

[Claim 1] A method of describing object region 
data such that information about an arbitrary object region 
in a video is described over a plurality of continuous 
frames, the method comprising: 

identifying a desired object region in a video 
according to at least either of a figure approximated to the 
object region or a characteristic point of the object 
region; approximating a trajectory obtained by arranging 
positions of representative points of the approximate figure 
or the characteristic points of the object region in 
a direction in which frames proceed with a predetermined 
function; and describing information about the object region 
by using the parameter of the function. 

[Claim 2] A method of describing object region 
data such that information about an arbitrary object region 
in a video is described over a plurality of continuous 
frames, the method comprising:

describing the object region data by using at least
information capable of identifying said object, the frame
number or the time stamp of a leading frame and the frame
number or the time stamp of a trailing frame of the
plurality of the subject frames, information for identifying
the type of the figure of an approximate figure approximating
the object region, and the parameter of a function with which
a trajectory obtained by arranging position data of
representative points of the approximate figure
corresponding to the object region in a direction in which
frames proceed has been approximated.

[Claim 3] A method of describing object region 
data such that information about an arbitrary object region 
in a video is described over a plurality of continuous 
frames, the method comprising:

describing the object region data by using at least 
information capable of identifying said object, the frame 
number or the time stamp of a leading frame and the frame 
number or the time stamp of a trailing frame of the 
plurality of the subject frames, the number of approximate 
figures approximating the object region, information for 
identifying the type of the figure of an approximate figure, 
and the parameters of functions with which trajectories 
corresponding to the approximate figures and obtained by 
arranging position data of representative points of each 
approximate figure in a direction in which frames proceed 
have been approximated. 

[Claim 4] A method of describing object region 
data such that information about an arbitrary object region 
in a video is described over a plurality of continuous 
frames, the method comprising:

describing the object region data by using at least
information capable of identifying said object, the frame
number of a leading frame and the frame number of a trailing
frame of the plurality of the subject frames, and the
parameter of a function with which a trajectory obtained by
arranging position data of characteristic points of the
object region in a direction in which frames proceed has
been approximated.

[Claim 5] A recording medium storing object region 
data containing information about regions of one or more 
objects described by the method of describing object region 
data according to any one of claims 1 to 4.

[Claim 6] A recording medium storing object region 
data containing information about regions of one or more 
objects described by the method of describing object region 
data according to any one of claims 1 to 4 and information 
related to each object or information indicating a method of 
accessing the related information.

[Claim 7] A recording medium storing object region 
data containing information about regions of one or more 
objects described by the method of describing object region 
data according to any one of claims 1 to 4 and information 
for identifying information related to each object, and 
information related to each object. 

[Claim 8] The method of describing object region 
data according to any one of claims 1 to 4, wherein 

the method comprises 

describing information related to the object or
a method of accessing the related information.

[Claim 9] A video data processing method for 
determining whether or not a predetermined object has been 
specified in a screen which is displaying a video, the 
method comprising: 

obtaining information describing a parameter of
a function approximating a trajectory obtained by arranging 
position data of representative points of the approximate 
figure in a direction in which frames proceed when 
an arbitrary position has been specified in the screen in 
a case where a region of the predetermined object exists in 
the video; detecting the position of the representative 
point in the frame based on the obtained information; 

detecting the position of the approximate figure in 
accordance with the detected position of the representative 
point;

determining whether or not the input position exists in 
the approximate figure; and 

determining that the predetermined object has been 
specified when a determination has been made that the input 
position exists in the approximate figure. 

[Claim 10] A video data processing method for
determining whether or not a predetermined object has been 
specified in a screen which is displaying a video, the 
method comprising: 

obtaining information describing a parameter of
a function approximating a trajectory obtained by arranging
position data of characteristic points of the object region
in a direction in which frames proceed when an arbitrary
position has been specified in the screen in a case where
a region of the predetermined object exists in the video;

detecting the positions of the characteristic points in the
frame in accordance with the obtained information;

determining whether or not the distance between the
input position and the detected position of the 
characteristic point is shorter than a reference value; and 

determining that the predetermined object has been 
specified when a determination has been made that the 
distance is shorter than the reference value. 

[Claim 11] The video data processing method according 
to claim 9 or 10, wherein

the method comprises showing information related to the 
predetermined object when a determination has been made that 
the predetermined object has been specified. 

[Claim 12] A video data processing method of 
displaying a region in which a predetermined object exists 
when the predetermined object has been specified in a screen 
which is displaying a video, the method comprising:

obtaining information describing a parameter of
a function approximating a trajectory obtained by arranging
position data of at least representative points of
an approximate figure of the object region or characteristic
points of the object region in a direction in which frames
proceed when the region of the predetermined object exists
in the video;

detecting the representative point or the characteristic
point in the frame in accordance with the obtained
information; and

displaying information for displaying the position of 
the object region in the screen in a predetermined form of 
display in accordance with the detected representative point 
or the characteristic point. 

[Claim 13] An object-region-data generating apparatus 
for generating data about described information of a region 
of an arbitrary object in a video over a plurality of 
continuous frames, the apparatus comprising: 

an approximating means for approximating an object 
region in the video in a plurality of the subject frames by 
using a predetermined figure; 

a detecting means for detecting, in the plural frames, 
coordinate values of the predetermined number of 
representative points identifying the predetermined figure 
which has been used in the approximation; and 

an approximating means for approximating a trajectory 
of a time sequence of the coordinate values of the 
representative points obtained over the plurality of the 
continuous frames with a predetermined function, 

so that information about the object region is 
generated by using the parameter of the function. 

[Claim 14] An object-region-data generating apparatus 
for generating data about described information of a region
of an arbitrary object in a video over a plurality of 
continuous frames, the apparatus comprising: 

a detecting means for detecting the coordinate values 
of the predetermined number of characteristic points of 
an object region in a video over the plurality of the 
subject frames, and 

an approximating means for approximating a time
sequential trajectory of the coordinate values of the 
characteristic points obtained over the plurality of the 
continuous frames with a predetermined function, 

so that information about the object region is 
generated by using the parameter of the function. 

[Claim 15] A data processing apparatus for performing 
a predetermined process when a predetermined object has been 
specified in a screen which is displaying a video, the 
apparatus comprising:

means for obtaining a parameter of a function 
approximating a trajectory obtained by arranging position 
data of representative points of an approximate figure of 
the object region in a direction in which frames proceed in 
a case where a region of a predetermined object exists in 
the video when an arbitrary position has been specified in 
the screen to detect the position of the representative 
point in the frame; 

a detecting means for detecting the position of the 
approximate figure in accordance with the detected position 
of the representative point; and

a determining means for determining whether or not the
input position exists in the approximate figure. 

[Claim 16] A data processing apparatus for performing 
a predetermined process when a predetermined object has been 
specified in a screen which is displaying a video, the 
apparatus comprising:

means for obtaining a parameter of a function 
approximating a trajectory obtained by arranging position 
data of a characteristic point of the object region in 
a direction in which frames proceed in a case where the 
region of the predetermined object exists in the video when 
an arbitrary position has been specified in the screen to
detect the position of the characteristic point in the 
frame; and 

a determining means for determining whether or not the 
distance between the input position and the detected 
position of the characteristic point is shorter than 
a reference value. 

[Claim 17] A data processing apparatus for performing 
a predetermined process when a predetermined object has been 
specified in a screen which is displaying a video, the
apparatus comprising:

means for obtaining a parameter of a function 
approximating a trajectory obtained by arranging position 
data of at least a representative point of an approximate 
figure of the object region or a characteristic point of the 
object region in a direction in which frames proceed when
the region of the predetermined object exists in the video
to detect the representative point or the characteristic
point in the frame; and

a displaying means for displaying information for indicating
the position of the object region in the screen in
a predetermined display form.

[Detailed Description of the invention] 

[0001] 

[Technical Field of the Invention] 

The present invention relates to a method of describing
an object such that information about an object region in
a video is described, a method of converting object region
data and an apparatus for converting object region data such
that information about an object region in a video is
generated, a data processing apparatus for representing
related information about an object in a video such as
hyper media, and an object region data processing method
therefor.

[0002] 

[Prior Art] 

Hyper media are configured such that related
information called a hyperlink is given between media,
such as videos, sounds or texts, to permit mutual reference.
When videos are mainly used, related information has been
provided for each object which appears in the video. When
the object is specified, related information is displayed.
The foregoing structure is a representative example of the
hyper media. The object in the video is expressed by
a frame number or a time stamp of the video, and information
for identifying a region in the video, which are recorded in
video data or recorded as individual data.
[0003] 

Mask images have frequently been used as means for 
identifying a region in a video. The mask image is a bit 
map image constituted by giving different pixel values 
between the inside portion of an identified region and the 
outside portion of the same. The simplest method has
an arrangement that a pixel value of "1" is given to the
inside portion of the region and "0" is given to the outside
portion of the same. Alternatively, α values which are
employed in computer graphics are sometimes employed. Since
the α value is usually able to express 256 levels of gray,
a portion of the levels is used. The inside portion of the
specified region is expressed as 255, while the outside
portion of the same is expressed as 0. The latter image is
called an α map. When the regions in the image are
expressed by the mask images, determination whether or not
a pixel in a frame is included in the specified region can
easily be made by reading the value of the pixel of the mask
image and by determining whether the value is 0 or 255. The
mask image has freedom with which a region can be expressed
regardless of the shape of the region and even
a discontinuous region can be expressed. The mask image
must, however, have the same number of pixels as the
original image.
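The mask lookup described above can be sketched in a few lines of Python. This is a minimal illustration only: the 4x4 map, the name `in_region`, and the convention that rows are indexed by y and columns by x are assumptions made for the example and do not come from the specification.

```python
# Hypothetical 4x4 alpha map: 255 inside the region, 0 outside.
ALPHA_MAP = [
    [0,   0,   0, 0],
    [0, 255, 255, 0],
    [0, 255, 255, 0],
    [0,   0,   0, 0],
]


def in_region(mask, x, y):
    """Return True when pixel (x, y) lies inside the masked region."""
    # Rows are indexed by y and columns by x (an assumed convention).
    return mask[y][x] == 255


print(in_region(ALPHA_MAP, 1, 2))  # True
print(in_region(ALPHA_MAP, 3, 0))  # False
```

The test itself is a single read, but the mask stores one value for every pixel of every frame, which is the data-quantity cost taken up in the following paragraphs.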
[0004] 

To reduce the quantity of data of the mask image, the 
mask image is frequently compressed. When the mask image is 
a binary mask image constituted by 0 and 1, a process of 
a binary image can be performed. Therefore, the compression 
method employed in facsimile machines or the like is 
frequently employed. In the case of MPEG-4, standardized by
ISO/IEC MPEG (Moving Picture Experts Group),
an arbitrary shape coding method will be employed in which
the mask image constituted by 0 and 1 and the mask image
using the α value are compressed. The foregoing compression
method is a method using motion compensation and capable of 
improving compression efficiency. On the other hand, 
complex compression and decoding processes are required. 

[0005] 

To express a region in a video, the mask image or the 
compressed mask image has usually been employed. However, 
data for identifying a region is required to permit easy and 
quick extraction, to be reduced in quantity and to permit 
easy handling. 

[0006] 

On the other hand, in the hyper media, which are usually
assumed to involve an operation for displaying related
information of a moving object in a video, specifying the
object is somewhat more difficult than when a still image
is handled. A user usually has difficulty in
specifying a specific portion. Therefore, it can be
considered that the user usually aims roughly at, for
example, a portion in the vicinity of the center of the
object. Moreover, a portion adjacent to the object
which is deviated from the object is frequently specified
according to the movement of the object. Therefore,
data for specifying a region is desired to be adaptable to
the foregoing media. Moreover, an aiding mechanism for
facilitating specification of a moving object in a video is
required for the system for displaying related information
of the moving object in the video.
[0007] 

[Objects of the Invention] 

As described above, the conventional method of
expressing a desired object region in a video by using the
mask image suffers from a problem in that the quantity of
data cannot be reduced. The method arranged to compress the
mask image raises a problem in that coding and decoding
become too complicated. What is worse, the pixels of
a predetermined frame cannot be accessed directly,
causing handling to become difficult.
[0008] 

There arises another problem in that a device for 
permitting a user to easily instruct a moving object in 
a video has not been provided. 

[0009] 

Accordingly, it is an object of the present invention
to provide a method of describing an object, an object region
data converting method, and an object region data converting 
apparatus which are capable of describing a desired object 
region in a video by using a small quantity of data and 
facilitating generation of data and handling of the same. 
[0010] 

Another object of the present invention is to provide 
a method of describing an object, an object region
data converting method, an object region data processing
method, an object region data converting apparatus, and
a data processing apparatus with which a user is permitted
to easily instruct an object in a video and determine the
object.

[0011] 

[Means for Achieving the Objects] 

According to one aspect of the present invention, there 
is provided a method of describing object region data such 
that information about an arbitrary object region in a video 
is described over a plurality of continuous frames, the 
method identifying a desired object region in a video 
according to at least either of a figure approximated to the 
object region or a characteristic point of the object 
region; approximating a trajectory obtained by arranging 
positions of representative points of the approximate figure 
or the characteristic points of the object region in 
a direction in which frames proceed with a predetermined 
function; and describing information about the object region
by using the parameter of the function.
[0012] 

According to another aspect of the present invention, 
there is provided a method of describing object region 
data such that information about an arbitrary object region 
in a video is described over a plurality of continuous 
frames, the method describing the object region data by 
using at least information capable of identifying said 
object, the frame number or the time stamp of a leading 
frame and the frame number or the time stamp of a trailing 
frame of the plurality of the subject frames, information 
for identifying the type of the figure of an approximate 
figure approximating the object region, and the parameter of 
a function with which a trajectory obtained by arranging 
position data of representative points of the approximate 
figure corresponding to the object region in a direction in 
which frames proceed has been approximated. 
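The items listed in this aspect can be pictured as one record per object. The following Python sketch is only an illustration under stated assumptions: the class name `ObjectRegionData`, the field names, the string used for the figure type, and the choice of polynomial coefficients as the function parameters are all hypothetical, since the specification does not fix a concrete layout here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ObjectRegionData:
    """Hypothetical record for one object region (names are assumptions)."""
    object_id: int      # information capable of identifying the object
    start_frame: int    # frame number (or time stamp) of the leading frame
    end_frame: int      # frame number (or time stamp) of the trailing frame
    figure_type: str    # type of the approximate figure, e.g. "ellipse"
    # One list of function parameters per coordinate of each
    # representative point (assumed here to be polynomial coefficients).
    trajectory_params: List[List[float]]

    def covers(self, frame: int) -> bool:
        """True when the record describes the given frame."""
        return self.start_frame <= frame <= self.end_frame


record = ObjectRegionData(
    object_id=1,
    start_frame=100,
    end_frame=250,
    figure_type="ellipse",
    trajectory_params=[[10.0, 2.0], [40.0, 0.5]],
)
print(record.covers(120))  # True
print(record.covers(300))  # False
```

However many frames the object spans, the record holds only the frame range, the figure type and a handful of function parameters.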
[0013] 

According to another aspect of the present invention, 
there is provided a method of describing object region 
data such that information about an arbitrary object region 
in a video is described over a plurality of continuous 
frames, the method describing the object region data by 
using at least information capable of identifying said 
object, the frame number or the time stamp of a leading 
frame and the frame number or the time stamp of a trailing 
frame of the plurality of the subject frames, the number of
approximate figures approximating the object region,
information for identifying the type of the figure of 
an approximate figure and the parameters of functions with 
which trajectories corresponding to the approximate figures 
and obtained by arranging position data of representative 
points of each approximate figure in a direction in which 
frames proceed have been approximated. 
[0014] 

According to another aspect of the present invention, 
there is provided a method of describing object region 
data such that information about an arbitrary object region 
in a video is described over a plurality of continuous 
frames, the method describing the object region data by 
using information capable of identifying said object, the 
frame number of a leading frame and the frame number of 
a trailing frame of the plurality of the subject frames, and 
the parameter of a function with which a trajectory obtained 
by arranging position data of characteristic points of the 
object region in a direction in which frames proceed has 
been approximated. 
[0015] 

According to another aspect of the present invention, 
there is provided a recording medium storing object region 
data containing information about regions of one or more 
objects described by the method of describing object region 
data according to one of the above methods.
[0016]

According to another aspect of the present invention, 
there is provided a recording medium storing object region 
data containing information about regions of one or more 
objects described by the method of describing object region 
data according to one of the above methods and information 
related to each object or information indicating a method of 
accessing the related information.
[0017] 

According to another aspect of the present invention, 
there is provided a recording medium storing object region 
data containing information about regions of one or more 
objects described by the method of describing object region 
data according to one of the above methods and information 
for identifying information related to each object, and 
information related to each object. 
[0018] 

It is desirable to describe information related to the 
object or a method of accessing the related information.
[0019] 

According to another aspect of the present invention, 
there is provided a video data processing method for 
determining whether or not a predetermined object has been 
specified in a screen which is displaying a video, the 
method obtaining information describing a parameter of
a function approximating a trajectory obtained by arranging
position data of representative points of the approximate
figure in a direction in which frames proceed when
an arbitrary position has been specified in the screen in 
a case where a region of the predetermined object exists in 
the video; detecting the position of the representative 
point in the frame based on the obtained information; 
detecting the position of the approximate figure in 
accordance with the detected position of the representative 
point; determining whether or not the input position exists 
in the approximate figure; and determining that the 
predetermined object has been specified when a determination 
has been made that the input position exists in the 
approximate figure.
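The sequence of steps in this aspect (obtain the function parameters, recover the representative points for the current frame, rebuild the approximate figure, test the input position) can be sketched as follows. This is a hedged illustration: it assumes that each trajectory is a polynomial in the frame number and that the approximate figure is a polygon whose vertices are the representative points. Neither assumption is stated in this passage, and the helper names are hypothetical.

```python
def eval_poly(coeffs, t):
    """Evaluate c0 + c1*t + c2*t**2 + ... using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * t + c
    return result


def figure_at(trajectories, frame):
    """Recover the representative points (polygon vertices) for a frame.

    trajectories: list of (x_coeffs, y_coeffs), one pair per point.
    """
    return [(eval_poly(xc, frame), eval_poly(yc, frame))
            for xc, yc in trajectories]


def point_in_polygon(point, vertices):
    """Even-odd (ray casting) containment test."""
    px, py = point
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Only edges that straddle the horizontal line y = py can be crossed.
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def object_specified(trajectories, frame, input_position):
    """True when the input position falls inside the figure at this frame."""
    return point_in_polygon(input_position, figure_at(trajectories, frame))


# Hypothetical rectangle that moves right by 2 pixels per frame.
RECT = [
    ([10.0, 2.0], [10.0]),
    ([30.0, 2.0], [10.0]),
    ([30.0, 2.0], [20.0]),
    ([10.0, 2.0], [20.0]),
]
print(object_specified(RECT, 5, (25.0, 15.0)))  # True
print(object_specified(RECT, 5, (15.0, 15.0)))  # False
```

At frame 5 the example rectangle spans x from 20 to 40, so the first input position is inside it and the second is not.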
[0020] 

According to another aspect of the present invention, 
there is provided a video data processing method for 
determining whether or not a predetermined object has been 
specified in a screen which is displaying a video, the 
method obtaining information describing a parameter of
a function approximating a trajectory obtained by arranging 
position data of characteristic points of the object region 
in a direction in which frames proceed when an arbitrary 
position has been specified in the screen in a case where 
a region of the predetermined object exists in the video; 
detecting the positions of the characteristic points in the 
frame in accordance with the obtained information; 
determining whether or not the distance between the input 
position and the detected position of the characteristic
point is shorter than a reference value; and determining
that the predetermined object has been specified when 
a determination has been made that the distance is shorter 
than the reference value. 
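The distance test in this aspect can be sketched in Python as below. The sketch assumes, for illustration only, that each characteristic-point trajectory is a polynomial in the frame number and that the distance is Euclidean. The function names and the sample values are hypothetical.

```python
import math


def eval_poly(coeffs, t):
    """Evaluate c0 + c1*t + c2*t**2 + ... using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * t + c
    return result


def near_characteristic_point(trajectories, frame, input_position, reference):
    """True when some characteristic point is closer than `reference`.

    trajectories: list of (x_coeffs, y_coeffs), one pair per point.
    """
    for x_coeffs, y_coeffs in trajectories:
        x = eval_poly(x_coeffs, frame)
        y = eval_poly(y_coeffs, frame)
        distance = math.hypot(input_position[0] - x, input_position[1] - y)
        if distance < reference:
            return True
    return False


# One hypothetical characteristic point, located at (115, 45) in frame 10.
POINTS = [([100.0, 1.5], [50.0, -0.5])]
print(near_characteristic_point(POINTS, 10, (118.0, 44.0), 5.0))  # True
print(near_characteristic_point(POINTS, 10, (130.0, 45.0), 5.0))  # False
```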
[0021] 

When a determination has been made that the 
predetermined object has been specified, it is desirable to 
show information related to the predetermined object. 
[0022] 

According to another aspect of the present invention, 
there is provided a video data processing method of 
displaying a region in which a predetermined object exists 
when the predetermined object has been specified in a screen 
which is displaying a video, the video processing method 
obtaining information describing a parameter of a function
approximating a trajectory obtained by arranging position
data of at least representative points of an approximate
figure of the object region or characteristic points of the
object region in a direction in which frames proceed when
the region of the predetermined object exists in the video;
detecting the representative point or the characteristic 
point in the frame in accordance with the obtained 
information; and displaying information for displaying the 
position of the object region in the screen in 
a predetermined form of display in accordance with the 
detected representative point or the characteristic point.
[0023]

According to another aspect of the present invention,
there is provided an object-region-data generating apparatus
for generating data about described information of a region
of an arbitrary object in a video over a plurality of
continuous frames, the object-region-data generating
apparatus comprising an approximating means for
approximating an object region in the video in a plurality
of the subject frames by using a predetermined figure;
a detecting means for detecting, in the plural frames,
coordinate values of the predetermined number of
representative points identifying the predetermined figure
which has been used in the approximation; and an approximating
means for approximating a trajectory of a time sequence
of the coordinate values of the representative points
obtained over the plurality of the continuous frames with
a predetermined function, so that information about the
object region is generated by using the parameter of the
function.
[0024] 

According to another aspect of the present invention,
there is provided an object-region-data generating apparatus
for generating data about described information of a region
of an arbitrary object in a video over a plurality of
continuous frames, the object-region-data generating
apparatus comprising a detecting means for detecting the
coordinate values of the predetermined number of
characteristic points of an object region in a video over
the plurality of the subject frames, and an approximating
means for approximating a time sequential trajectory of the
coordinate values of the characteristic points obtained over
the plurality of the continuous frames with a predetermined
function, wherein the parameter of the function is used to
generate information about the object region.
[0025] 

According to another aspect of the present invention, 
there is provided a data processing apparatus for performing 
a predetermined process when a predetermined object has been 
specified in a screen which is displaying a video, the 
apparatus comprising means for obtaining a parameter of 
a function approximating a trajectory obtained by arranging 
position data of representative points of an approximate 
figure of the object region in a direction in which frames 
proceed in a case where a region of a predetermined object 
exists in the video when an arbitrary position has been 
specified in the screen to detect the position of the 
representative point in the frame; a detecting means for 
detecting the position of the approximate figure in 
accordance with the detected position of the representative 
point; and a determining means for determining whether or 
not the input position exists in the approximate figure. 
[0026] 

According to another aspect of the present invention, 
there is provided a data processing apparatus for performing
a predetermined process when a predetermined object has been
specified in a screen which is displaying a video, the 
data processing apparatus comprising means for obtaining 
a parameter of a function approximating a trajectory 
obtained by arranging position data of a characteristic 
point of the object region in a direction in which frames 
proceed in a case where the region of the predetermined 
object exists in the video when an arbitrary position has been
specified in the screen to detect the position of the 
characteristic point in the frame; and a determining means 
for determining whether or not the distance between the 
input position and the detected position of the 
characteristic point is shorter than a reference value. 
[0027] 

According to another aspect of the present invention, 
there is provided a data processing apparatus for performing 
a predetermined process when a predetermined object has been 
specified in a screen which is displaying a video, the
data processing apparatus comprising means for obtaining 
a parameter of a function approximating a trajectory 
obtained by arranging position data of at least 
a representative point of an approximate figure of the 
object region or a characteristic point of the object region 
in a direction in which frames proceed when the region of 
the predetermined object exists in the video to detect the 
representative point or the characteristic point in the 
frame; and a displaying means for displaying information for
indicating the position of the object region in the screen
in a predetermined display form. 
[0028] 

Note that the present invention relating to the 
apparatus may be employed as the method and the present 
invention relating to the method may be employed as the 
apparatus.

[0029] 

The present invention relating to the apparatus and the 
method may be employed as a recording medium which stores 
a program for causing a computer to perform the procedure 
according to the present invention (or causing the computer 
to serve as means corresponding to the present invention or 
causing the computer to realize the function corresponding 
to the present invention) and which can be read by the 
computer.

[0030] 

The present invention is configured such that the 
object region in a video over a plurality of frames is 
described as a parameter of a function approximating 
a trajectory obtained by arranging position data of 
representative points of an approximate figure of the object 
region or a characteristic point of the object region in 
a direction in which frames proceed. Therefore, the object
region in the video over the plural frames can be described
with a small quantity of the function parameters. Hence it
follows that the quantity of data required to identify the
object region can effectively be reduced. Moreover,
handling can be facilitated. Moreover, extraction of 
a representative point or a characteristic point from the 
approximate figure or generation of the parameter of the 
approximate curve can easily be performed. Moreover, 
generation of an approximate figure from the parameter of 
the approximate curve can easily be performed. 
[0031] 

When the representative point of the approximate figure 
is employed, a fundamental figure, for example, one or more 
ellipses, are employed such that each ellipse is represented 
by two focal points and another point. Thus, whether or not 
arbitrary coordinates specified by a user exist in the 
object region (the approximate figure) can be determined by 
using a simple discriminant. Hence it follows that the user 
is able to easily instruct a moving object in a video. 
[0032] 

When the characteristic point is employed, whether or 
not the arbitrary coordinates specified by a user indicate
the object region can be determined very easily.
Thus, a moving object in a video can easily be specified by 
the user. 

[0033] 

When display of an object region among regions of 
objects which can be identified by using object region 
data and which has related information, or display of 
an image indicating the object region is controlled, the
user is permitted to quickly recognize whether or not
related information exists and the position of the object 
region. Therefore, the operation which is performed by the 
user can effectively be aided. 
[0034] 

[Embodiments of the Invention] 

Embodiments according to the present invention will now 
be described with reference to the accompanying drawings. 

[0035] 
(First Embodiment) 

FIG. 1 shows the structure of a first embodiment of the 
present invention. As shown in FIG. 1, an object-region- 
data generating apparatus comprises a video data storage 
portion 100, a region extracting portion 101, a portion for 
approximating region with a figure 102, a figure- 
representative-point extracting portion 103, a portion for 
approximating representative point to a trajectory curve 104, 
a related information storage portion 105 and a region 
data storage portion 106. A case will now be described in 
which the process according to this embodiment (in 
particular, processes arranged to be performed by the region 
extracting portion 101 or the region figure approximating 
portion 102) is configured such that the operation which is 
performed by a user is permitted. In the foregoing case, 
the GUI (not shown in FIG. 1) is employed with which video 
data is displayed in, for example, frame units to permit 
input of an instruction from the user.
[0036]

The video data storage portion 100 stores video 
data and comprises, for example, a hard disk, an optical 
disk or a semiconductor memory. 

[0037] 

The region extracting portion 101 extracts a partial region
of the video data. The partial region is the
region of an object, such as a specific person, a vehicle
or a building (or alternatively, a portion of the
object, for example, the head of a person, the bonnet of
a vehicle or the front door of a building) in the video.
Usually a video has the same object in the continuous frames 
thereof. The region corresponding to the same object 
frequently changes owing to the movement of the object or 
shaking of a camera during an image pick-up operation. 

[0038] 

The region extracting portion 101 extracts an object 
region in each frame corresponding to the movement or 
deformation of the object of interest. Specifically, the 
extraction is performed by a method of manually specifying 
a region in each of all of the frames. Another method may 
be employed with which the contour of an object is 
continuously extracted by using an active contour model
called "Snakes" as disclosed in "Snakes: Active contour
models" (M. Kass et al., International Journal of Computer
Vision, vol. 1, No. 4, pp. 321-331, July, 1988).
Also a method disclosed in "Method of tracing high-speed
mobile object for producing hyper media contents by using
robust estimation" (CVIM 113-1, 1998, technical report of 
Information Processing Society of Japan) may be employed. 
According to the disclosure, deformation and movement of the 
overall body of an object are estimated in accordance with 
a position to which a partial object region has been moved 
and which has been detected by performing block matching. 
Alternatively, a method of identifying a region having 
similar colors by performing growing and division of 
a region as disclosed in Image Analysis Handbook (Chapter-2, 
Section II, Publish Conference of Tokyo University, 1991) 
may be employed. 
[0039] 

The portion for approximating region with a figure 
(hereinafter called a "region figure approximating portion") 
102 approximates an object region in a video extracted by 
the region extracting portion 101 with a predetermined 
figure. The figure may be an arbitrary figure, such as 
a rectangle, a circle, an ellipse or a polygon. Also 
a method of approximating a region may be a method of 
performing approximation to a figure circumscribing the 
region. Another method of performing approximation to 
a figure inscribing the region may be employed or a method 
may be employed which is arranged such that the centroid of 
the region is employed as the centroid of the approximate 
figure. Another method of making the area ratio of the 
region and the approximate figure to be the same may be
employed. As an alternative to the approximation of the
object region with a predetermined type figure, the type of 
the figure may be specified by a user for each object to 
which approximation is performed. Another method may be 
employed with which the type of the figure is automatically 
selected in accordance with the shape of the object or the 
like for each object to which approximation is
performed.

[0040] 

The approximation of the region with the figure is 
performed for each frame whenever a result of extraction 
performed by the region extracting portion 101 is input. 
Alternatively, approximation with a figure may be performed 
by using a result of extraction of a plurality of preceding 
and following frames. When the results of extraction of the
plural frames are employed, change in the size and position of
the approximate figure is smoothed among the plural frames 
so that the movement and deformation of the approximate 
figure are smoothed or an error in the extraction of the 
region is made to be inconspicuous. Note that the size of 
the approximate figure may vary among the frames. 
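
The smoothing among plural frames mentioned above can be sketched as follows. The text does not name a particular smoothing method, so the centered moving average and the names smooth and radius are assumptions made only for illustration. The function would be applied separately to each parameter of the approximate figure (for example, the X coordinate of its centroid, or its width).

```python
def smooth(values, radius=1):
    """Centered moving average over per-frame values of one figure parameter.

    The window is clipped at both ends of the sequence, so the first and
    last frames are averaged over fewer samples.
    """
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - radius): i + radius + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Example: a width that jitters between frames because of extraction errors.
widths = [0.0, 3.0, 0.0, 3.0]
print(smooth(widths, radius=1))
```

A larger radius suppresses extraction errors more strongly but also delays the response to a real change in the size or position of the object.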

[0041] 

The figure-representative-point extracting portion 103 
extracts representative points of the approximate figure 
which is an output of the region figure approximating
portion 102. The point which is employed as the
representative point varies according to the type of the
employed approximate figure. When the approximate figure is,
for example, a rectangle, the four or three
vertices of the rectangle may be the representative points.
When the approximate figure is formed into a circle, the 
representative points may be the center and one point on the 
circumference or two end points of the diameter. When the 
approximate figure is an ellipse, the representative points
may be the vertices of a circumscribed rectangle of the
ellipse or the two focal points and one point on the ellipse.
When an arbitrary closed polygon is the approximate figure, 
the vertices may be the representative points of the figure. 
[0042] 

The representative points are extracted in frame units 
whenever information about the approximate figure for one 
frame is output from the region figure approximating portion 
102. Each representative point is expressed by its
coordinate along the horizontal X axis and its
coordinate along the vertical Y axis.

[0043] 

The portion for approximating representative point to 
a trajectory curve (hereinafter called a "representative 
point trajectory curve approximating portion") 104 time- 
sequentially approximates the sequence of the representative 
points extracted by the figure-representative-point 
extracting portion 103 to a curve. The approximate curve is, 
for each of the X coordinate and Y coordinate of each 
representative point, expressed as a function of the frame
number f or time stamp t given to the video. The
approximation with the curve may be approximation with 
a straight line or approximation with a spline curve. 
[0044] 

The related information storage portion 105 stores
information (or alternatively, information about
the address at which related information is stored in another
storage apparatus, for example, on the Internet or a server on
a LAN) relating to the object which appears in video
data stored in the video data storage portion 100. Related
information may be a character, voice, a still image, 
a moving image or their combination. Alternatively, related 
information may be data describing the operation of 
a program or a computer. Similarly to the video 
data storage portion 100, the related information storage 
portion 105 comprises a hard disk, an optical disk or 
a semiconductor memory. 

[0045] 

The region data storage portion 106 is a storage medium 
in which object region data is stored which includes 
data for expressing a formula of the curve approximating the 
time-sequential trajectory of the representative points 
which is the output of the representative point trajectory 
curve approximating portion 104. When related information
about the object corresponding to the region expressed by 
a function has been stored in the related information 
storage portion 105, object region data may include related
information and the address at which related information has
been recorded. When information of the address of recorded 
related information has been stored in the related 
information storage portion 105, also address information 
may be recorded. Similarly to the video data storage 
portion 100 and the related information storage portion 105, 
the region data storage portion 106 comprises a hard disk, 
an optical disk or a semiconductor memory.
[0046] 

The video data storage portion 100, the related 
information storage portion 105 and the region data storage 
portion 106 may be constituted by individual pieces of 
storage apparatus. Alternatively, the overall portion or 
a portion may be constituted by one storage apparatus. 

[0047] 

The object-region-data generating apparatus may be 
realized by software which is operated on a computer.
[0048] 

The operation of the object-region-data generating 
apparatus will specifically be described. 
[0049] 

FIG. 2 shows diagrams more specifically showing 
a sequential process. The sequential process includes 
a process which is performed by the region extracting 
portion 101 to extract the object region. Moreover, 
a process which is performed by the region figure 
approximating portion 102 to approximate the region and
a process which is performed by the figure-representative-
point extracting portion 103 to extract a representative
point of a figure are included. Also a process which is
performed by the representative point trajectory curve 
approximating portion 104 to approximate the representative 
point trajectory with a curve is included. 
[0050] 

In FIG. 2, the region figure approximating portion 102
employs a method of approximating the region with an ellipse. 
The figure-representative-point extracting portion 103 
employs a method of extracting the two focal points of the 
ellipse and one point on the ellipse. The representative 
point trajectory curve approximating portion 104 employs 
a method of approximating the sequence of the representative 
points with a spline function. 

[0051] 

Referring to FIG. 2(a), reference numeral 200 
represents a video of one frame which is to be processed. 
[0052] 

Reference numeral 201 represents the object region 
which is to be extracted. A process for extracting the 
object region 201 is performed by the region extracting 
portion 101. 

[0053] 

Reference numeral 202 represents an ellipse which is 
a result of approximation of the object region 201 with
an ellipse. A process for obtaining the ellipse 202 from
the object region 201 is performed by the region figure
approximating portion 102. 
[0054] 

FIG. 3 shows an example of the method of obtaining 
an approximate ellipse when the object region is expressed 
by a parallelogram. Points A, B, C and D shown in FIG. 3
are vertices of the parallelogram which is the object region.
In the foregoing case, calculations are performed to
determine which of side AB and side BC is the longer side.
Then, the smallest rectangle having portions of its sides
which are the longer side and its opposite side is
determined. In the case shown in FIG. 3, a rectangle having
four points A, B', C and D' is the smallest rectangle. The
approximate ellipse is the circumscribing ellipse which is
similar to the ellipse inscribed in the rectangle and which
passes through the points A, B', C and D'.

[0055] 

Referring to FIG. 2(b), reference numerals 203 
represent representative points of a figure expressing 
an ellipse. Specifically, the representative points are two
focal points of the ellipse and one point on the same. The
focal points of the ellipse can easily be determined from
points on the two axes or a circumscribing rectangle of the
ellipse. An example will now be described with which focal
points F and G are determined from two points P0 and P1 on
the major axis and point H on the minor axis shown in FIG. 4.
[0056]

Initially, a and b which are parameters of the major 
axis and the minor axis, center C of the ellipse and 
eccentricity e are determined as follows: 

E(P0, P1) = 2 × a

C = (P0 + P1)/2

E(C, H) = b

e = (1/a) × √(a × a − b × b)
where E (P, Q) is the Euclidean distance between the point P 
and the point Q. 

[0057] 

In accordance with the determined parameters, the focal 
points F and G can be determined as follows: 
F = C + e × (P0 − C)
G = C − e × (P0 − C)
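
The calculation above can be sketched in a few lines. The function name ellipse_foci and the tuple representation of points are assumptions made for illustration. The sketch takes G to be the reflection of F through the center C, since the two focal points of an ellipse lie symmetrically about its center on the major axis.

```python
import math


def ellipse_foci(p0, p1, h):
    """Return the focal points (F, G) of an ellipse.

    p0, p1: end points of the major axis; h: end point of the minor axis.
    Points are (x, y) tuples.
    """
    a = math.dist(p0, p1) / 2.0                      # E(P0, P1) = 2 x a
    c = ((p0[0] + p1[0]) / 2.0, (p0[1] + p1[1]) / 2.0)  # center C
    b = math.dist(c, h)                              # E(C, H) = b
    e = math.sqrt(a * a - b * b) / a                 # eccentricity
    f = (c[0] + e * (p0[0] - c[0]), c[1] + e * (p0[1] - c[1]))
    g = (c[0] - e * (p0[0] - c[0]), c[1] - e * (p0[1] - c[1]))
    return f, g


# Ellipse with semi-axes a = 5 and b = 3 centered on the origin.
f, g = ellipse_foci((-5.0, 0.0), (5.0, 0.0), (0.0, 3.0))
print(f, g)
```

For the example ellipse the eccentricity is 0.8, so the focal points lie 4 units on either side of the center along the major axis.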
[0058] 

The foregoing process for extracting the representative 
points from the ellipse is performed by the figure- 
representative-point extracting portion 103. 

[0059] 

The representative points extracted by the foregoing 
process are usually varied in the position among the 
successive frames owing to movement of the object of 
interest in the video or shaking of the image pick-up camera. 
Therefore, the corresponding representative points of the
ellipses are time-sequentially arranged to perform
approximation with a spline function for each of the X and Y
axes. In this embodiment, each of the three points F, G and
H (see FIG. 4) which are the representative points of the 
ellipse requires a spline function for the X and Y 
coordinates. Therefore, six spline functions are produced. 
[0060] 

The approximation to a curve with spline functions is 
performed by the representative point trajectory curve 
approximating portion 104. 
[0061] 

The process which is performed by the representative 
point trajectory curve approximating portion 104 may be 
carried out whenever the coordinates of the representative 
points of each frame relating to the object region are 
obtained. For example, the approximation is performed
whenever the coordinates of the representative points in
each frame are obtained, and an approximation error
is obtained so that the approximation section is divided
as required to keep the approximation error within
a predetermined range. Another method may be employed with
which the process is performed after the coordinates of the 
representative points in all of the frames relating to the 
object region have been obtained. 
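
The frame-by-frame procedure with error-driven division can be sketched as follows for one coordinate of one representative point. This is a minimal sketch under stated assumptions: it uses straight-line segments (a first-order polynomial per section) fitted by least squares, and the names fit_line, max_error, segment_trajectory and tol are hypothetical. Because each section is fitted independently, continuity at the knots is not enforced as it would be in a true spline fit.

```python
def fit_line(ts, xs):
    """Least-squares line x = c0 + c1 * t; returns (c0, c1)."""
    n = len(ts)
    mt = sum(ts) / n
    mx = sum(xs) / n
    var = sum((t - mt) ** 2 for t in ts)
    c1 = sum((t - mt) * (x - mx) for t, x in zip(ts, xs)) / var if var else 0.0
    return mx - c1 * mt, c1


def max_error(ts, xs, coeffs):
    """Largest absolute deviation of the samples from the fitted line."""
    c0, c1 = coeffs
    return max(abs(c0 + c1 * t - x) for t, x in zip(ts, xs))


def segment_trajectory(ts, xs, tol):
    """Greedy division: extend a section one frame at a time and close it
    when the fit error would exceed tol.

    Returns a list of (start_frame, end_frame, (c0, c1)).
    """
    segments = []
    start, n = 0, len(ts)
    while start < n - 1:
        end = start + 1  # a section always holds at least two samples
        coeffs = fit_line(ts[start:end + 1], xs[start:end + 1])
        while end + 1 < n:
            trial = fit_line(ts[start:end + 2], xs[start:end + 2])
            if max_error(ts[start:end + 2], xs[start:end + 2], trial) > tol:
                break
            end += 1
            coeffs = trial
        segments.append((ts[start], ts[end], coeffs))
        start = end  # adjacent sections share the knot sample
    return segments


# A coordinate that moves slowly up to frame 5 and faster afterwards.
ts = list(range(17))
xs = [float(t) if t <= 5 else 5.0 + 3.0 * (t - 5) for t in ts]
for seg in segment_trajectory(ts, xs, tol=0.5):
    print(seg)
```

With this input the trajectory is divided at frame 5, giving one section for frames 0 to 5 and another for frames 5 to 16.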

[0062] 

Reference numeral 204 shown in FIG. 2(c) represents the 
approximated spline function expressed three-dimensionally.
Reference numeral 205 shown in FIG. 2(d) represents 
an example of the spline function which is the output of the
representative point trajectory curve approximating portion
104 (only one axis of coordinate of one representative point
is shown). In this example, the approximation region is
divided into two sections (the number of knots is two) which
are t = 0 to 5 and t = 5 to 16.
[0063] 

The thus-obtained spline functions are recorded in the 
region data storage portion 106 in a predetermined 
data format. 

[0064] 

As described above, this embodiment enables the object 
region in a video to be described as the parameter of 
a curve approximating a time-sequential trajectory 
(a trajectory of the coordinates of the representative
points whose variable is the frame number or the time
stamp) of the representative points of the approximate
figure of the region. 
[0065] 

According to this embodiment, the object region in 
a video can be expressed by only the parameters of the 
function. Therefore, object region data, the quantity of 
which is small and which can easily be handled, can be 
produced. Also extraction of representative points from the 
approximate figure and producing of parameters of the 
approximate curve can easily be performed. Moreover, 
producing of an approximate figure from the parameters of 
the approximate curve can easily be performed.
[0066]

A method may be employed with which a basic figure, for
example, one or more ellipses, is employed as the
approximate figure and each ellipse is represented by two
focal points and another point. In the foregoing case, 
whether or not arbitrary coordinates specified by a user 
exist in the region (the approximate figure) of the object 
(whether or not the object region has been specified) can be 
determined by a simple discriminant. Thus, specification of
a moving object in a video can be performed even more
easily by the user.
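
The text does not spell out the test itself. One natural choice, assumed here, follows from the definition of an ellipse: the sum of the distances from any point on the ellipse to the two focal points is constant, so a point lies inside the ellipse when its own sum of focal distances does not exceed that of the stored point on the ellipse. The function name inside_ellipse is hypothetical.

```python
import math


def inside_ellipse(p, f, g, h):
    """True if point p lies inside or on the ellipse with focal points f, g
    that passes through h. All points are (x, y) tuples.
    """
    limit = math.dist(f, h) + math.dist(g, h)   # constant sum for the ellipse
    return math.dist(f, p) + math.dist(g, p) <= limit


# Focal points (-4, 0) and (4, 0), with (0, 3) on the ellipse.
f, g, h = (-4.0, 0.0), (4.0, 0.0), (0.0, 3.0)
print(inside_ellipse((0.0, 0.0), f, g, h))   # a click at the center
print(inside_ellipse((6.0, 0.0), f, g, h))   # a click beyond the major axis
```

Only three distance computations and one comparison are needed per ellipse, which is what makes this representation convenient for deciding whether a user has pointed at the object.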
[0067] 

The data format of object region data which is stored 
in the region data storage portion 106 will now be described. 
A case will now be described in which the representative 
points are approximated with a spline function. As a matter
of course, a case in which the representative points are
approximated with another function is handled similarly.

[0068] 

FIG. 5 shows an example of the data format of object 
region data for recording the spline function indicating the 
object region in a video and information related to the 
object. 

[0069] 

ID number 400 is an identification number which is
given to each object.
[0070]

A leading frame number 401 and a trailing frame number
402 are leading and trailing frame numbers for defining
existence of the object having the foregoing ID number.
Specifically, the numbers 401 and 402 are the frame number
at which the object appears in the video and the frame
number at which the object disappears. The frame numbers
are not required to be the frame numbers at which the object 
actually appears and disappears in the video. For example, 
an arbitrary frame number after the appearance of the object 
in the video may be the leading frame number. An arbitrary 
frame number which follows the leading frame number and 
which precedes the frame of disappearance of the object in 
the video may be the trailing frame number. The 
leading/trailing time stamp may be substituted for the 
leading/trailing frame number.

[0071] 

A pointer (hereinafter called a "related information
pointer") 403 for pointing to related information is the
address or the like of the data region in which data of
information related to the object having the foregoing ID
number is recorded. When the related information pointer 403
is used, retrieval and display
of information related to the object can easily be performed.
The related information pointer 403 may be a pointer
to data describing
a program or the operation of a computer. In the
foregoing case, when the object has been specified by a user,
the computer performs a predetermined operation.
[0072] 

The manner in which the related information
pointer 403 is described in the object
region data will now be described. As an alternative to
using the pointer 403, related information itself may be
described in object region data. Alternatively, either the
related information pointer 403 or related
information may be described in object region data. In the
foregoing case, a flag is required to indicate whether the
related information pointer
or related information has been described in object region
data.

[0073] 

The approximate figure number 404 is the number of the 
figures approximating the object region. In the example 
shown in FIG. 2 in which the object region is approximated 
with one ellipse, the number of the figures is 1. 

[0074] 

Approximate figure data 405 is data (for example, the 
parameter of a spline function) of a trajectory of the 
representative point of the figure for expressing 
an approximate figure. 

Note that approximate figure data 405 exists by the
number corresponding to the approximate figure number 404
(a case where the approximate figure number 404 is two or
larger will be described later).

The approximate figure number 404 for
object region data may always be one (therefore, also
approximate figure data 405 is always one) to omit the field
for the approximate figure number 404.

[0075] 

FIG. 6 shows the structure of approximate figure 
data 405. 

[0076] 

A figure type ID 1300 is identification data for 
indicating the type of a figure serving as the approximate 
figure, the figure type ID 1300 being arranged to identify 
a circle, an ellipse, a rectangle and a polygon. 
[0077] 

A representative point number 1301 indicates the number 
of representative points of the figure specified by the 
figure type ID 1300. 
[0078] 

A pair of representative point trajectory data items 
1302 and 1303 are data regions relating to the spline 
function for expressing the trajectory of the representative 
points of the figure. The representative points of one 
figure require data of one pair of spline functions for the 
X and Y coordinates. Therefore, data of the trajectory of 
the representative points for specifying the spline function 
exists by representative point number (M) × 2.
[0079]

Note that the type of the employed approximate figure
may previously be limited to one type, for example, 
an ellipse. In the foregoing case, the field for the figure 
type ID 1300 shown in FIG. 6 may be omitted. 

[0080] 

When the representative point number is defined 
according to the figure type ID, the representative point 
number may be omitted. 
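
The fields of FIG. 5 and FIG. 6 described above can be pictured as the following in-memory layout. This is only a sketch: the class and attribute names are hypothetical, the byte-level encoding is not specified in the text and is not modelled, and the counts 404 and 1301 are derived from the lists rather than stored, which corresponds to the cases in which those fields may be omitted.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple


@dataclass
class ApproximateFigureData:                    # item 405, FIG. 6
    figure_type_id: str                         # 1300, e.g. "ellipse"
    # 1302/1303: one (x_trajectory, y_trajectory) pair per representative point
    trajectories: List[Tuple[Any, Any]] = field(default_factory=list)

    @property
    def representative_point_number(self) -> int:   # 1301
        return len(self.trajectories)


@dataclass
class ObjectRegionData:                         # FIG. 5
    id_number: int                              # 400
    leading_frame: int                          # 401
    trailing_frame: int                         # 402
    related_info_pointer: Optional[str] = None  # 403
    figures: List[ApproximateFigureData] = field(default_factory=list)

    @property
    def approximate_figure_number(self) -> int:     # 404
        return len(self.figures)

    def trajectory_count(self) -> int:
        """Number of trajectory functions: representative points (M) x 2."""
        return sum(2 * f.representative_point_number for f in self.figures)


# One ellipse with three representative points (F, G and H).
ellipse = ApproximateFigureData("ellipse", [(None, None)] * 3)
record = ObjectRegionData(1, 0, 16, "info/object1", [ellipse])
print(record.approximate_figure_number, record.trajectory_count())
```

For a single ellipse described by two focal points and one point on the ellipse, the record holds six trajectory functions, one for each of the X and Y coordinates of the three points.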

[0081] 

FIG. 7 shows an example of the structure of 
representative point trajectory data 1302 and 1303. 
[0082] 

A knot frame number 1400 indicates a knot of the
spline function and indicates that polynomial data 1403 is
valid up to that knot. The number of
coefficient data 1402 of the polynomial varies according to
the highest order of the spline function (assuming that the
highest order is K, the number of coefficient data is K + 1).
Therefore, reference to a polynomial order 1401 is made.
Subsequent to the polynomial order 1401, polynomial
coefficients 1402 by the number corresponding to the
polynomial order (K) + 1 follow.
[0083] 

Since the spline function is expressed by an individual
polynomial between the knots, the polynomials are required by
the number corresponding to the number of knots. Therefore,
data 1403 including the knot frame number and the
coefficients of the polynomial is described repeatedly. When
the knot frame number is the same as the trailing frame number,
the data is the last polynomial coefficient data.
Therefore, termination of representative point trajectory 
data can be understood. 
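
Reading a coordinate back from representative point trajectory data can be sketched as follows. Several details are assumptions, since the text does not fix them: each repeated unit 1403 is taken to apply up to and including its knot frame, the coefficients are taken to be stored lowest order first, and the polynomial variable is taken to be the absolute frame number. The names PolynomialData and evaluate are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PolynomialData:              # one repeated unit 1403 in FIG. 7
    knot_frame: int                # 1400: last frame this polynomial covers (assumed)
    coefficients: List[float]      # 1402: c0..cK, lowest order first (assumed)

    @property
    def order(self) -> int:        # 1401: K, so there are K + 1 coefficients
        return len(self.coefficients) - 1


def evaluate(trajectory: List[PolynomialData], frame: float) -> float:
    """Coordinate value at the given frame for one axis of one point."""
    for piece in trajectory:
        if frame <= piece.knot_frame:
            value = 0.0
            for c in reversed(piece.coefficients):   # Horner's rule
                value = value * frame + c
            return value
    raise ValueError("frame lies beyond the trailing frame")


# Two sections, as in the example with knots at frames 5 and 16.
trajectory = [PolynomialData(5, [0.0, 1.0]), PolynomialData(16, [-10.0, 3.0])]
print(evaluate(trajectory, 3), evaluate(trajectory, 10))
```

The last unit in the list is the one whose knot frame equals the trailing frame number, which is how a reader of the data knows that the trajectory data has ended.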
[0084] 

A case will now be described in which a figure except 
for the ellipse is employed as the approximate figure. 
[0085] 

FIG. 8 is a diagram showing the representative points in
a case where a parallelogram is employed as the approximate
figure. Points A, B, C and D are vertices of the
parallelogram. When three of the four vertices are
determined, the remaining one is uniquely determined. Therefore,
only three vertices among the four vertices are required to
serve as the representative points. In the foregoing example,
three points, which are A, B and C, are employed as the
representative points.
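
Recovering the fourth vertex from the three stored ones can be sketched as follows. The sketch assumes that A, B, C and D are labelled consecutively around the parallelogram, so that the diagonals AC and BD share a midpoint and D = A + C − B. The function name fourth_vertex is hypothetical.

```python
def fourth_vertex(a, b, c):
    """Vertex D of parallelogram ABCD given consecutive vertices A, B, C.

    The diagonals of a parallelogram bisect each other, so A + C = B + D.
    """
    return (a[0] + c[0] - b[0], a[1] + c[1] - b[1])


print(fourth_vertex((0, 0), (4, 0), (5, 2)))
```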

[0086] 

FIG. 9 is a diagram showing representative points in 
a case where a polygon is employed to serve as the 
approximate figure. In the case of the polygon, the order
of the vertices is made to be the order along the outer
periphery. Since the example shown in FIG. 9 has 10 vertices,
all of the vertices N1 to N10 are employed as the
representative points. In the foregoing case, the number of
the vertices may be reduced by employing only vertices each
having an internal angle smaller than 180° as the
representative points.
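
The reduction of vertices can be sketched as follows. The test used here is an assumption about how the internal angle would be checked in practice: for a simple polygon whose vertices are listed in order along its periphery, a vertex has an internal angle smaller than 180° exactly when the turn at that vertex has the same direction as the orientation of the polygon as a whole. The function names are hypothetical.

```python
def signed_area(poly):
    """Signed area of a simple polygon; positive for counter-clockwise order."""
    return 0.5 * sum(x0 * y1 - x1 * y0
                     for (x0, y0), (x1, y1) in zip(poly, poly[1:] + poly[:1]))


def convex_vertices(poly):
    """Keep only vertices whose internal angle is smaller than 180 degrees.

    poly is a list of (x, y) tuples in order along the periphery.
    """
    orient = 1.0 if signed_area(poly) > 0 else -1.0
    n = len(poly)
    kept = []
    for i in range(n):
        (px, py), (cx, cy), (nx, ny) = poly[i - 1], poly[i], poly[(i + 1) % n]
        # z component of the cross product of the incoming and outgoing edges
        cross = (cx - px) * (ny - cy) - (cy - py) * (nx - cx)
        if cross * orient > 0:
            kept.append(poly[i])
    return kept


# L-shaped region: the inner corner (2, 2) has an internal angle of 270 degrees.
l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
print(convex_vertices(l_shape))
```

Vertices lying on a straight edge (an internal angle of exactly 180°) are also dropped by this test, since they do not satisfy the "smaller than 180°" condition.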
[0087] 

As described above, the foregoing process may be 
performed by software which is operated on a computer.
FIG. 10 is a flowchart showing the process which is 
performed by the video processing apparatus according to 
this embodiment. When the video processing apparatus 
according to this embodiment is realized by software, 
a program according to the flowchart shown in FIG. 10 is 
produced. 

[0088] 

In step S11, video data for one frame is extracted from
the video data storage portion 100. 
[0089] 

In step S12, the region of a predetermined object in 
the video is extracted. Extraction may be performed by 
a method similar to that employed by the region extracting 
portion 101.

[0090] 

In step S13, the region data which is a result of the
process performed in step S12 is approximated with
a figure. The approximation method may be similar to that
employed by the region figure approximating portion 102. 

[0091] 

In step S14, the representative points of the figure
approximated in step S13 are extracted.
[0092]

In step S15, approximation of the positions of
a representative point train of the approximate figure in
the successive frames with a curve is performed.

[0093] 

In step S16, a branching process is performed. Thus, 
determination is made whether or not the processed image is
in the final frame or whether or not the object which is to
be extracted has disappeared from the processed frame
(or is considered to have disappeared).
In an affirmative case, the process is
branched to step S17. In a negative case (both of the cases
are negated), the process is branched to step S11.

[0094] 

In step S17, the approximate curve calculated in step 
S15 is recorded in a recording medium as object region 
data in accordance with a predetermined format. 
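
The flow of steps S11 to S17 can be sketched as a loop. The sketch is deliberately abstract: the four callables stand in for the processes of portions 101 to 104 and are hypothetical, as are the function name and the dictionary returned in place of the recording format. Treating a region of None as the sign that the object has disappeared is a simplification of the branch in step S16.

```python
def generate_object_region_data(frames, extract_region, approximate_figure,
                                extract_points, fit_curves):
    """frames yields (frame_number, frame_image) pairs in display order."""
    frame_numbers, point_rows, curves = [], [], None
    for number, frame in frames:                        # S11: next frame
        region = extract_region(frame)                  # S12: object region
        if region is None:                              # S16: object disappeared
            break
        figure = approximate_figure(region)             # S13: approximate figure
        point_rows.append(extract_points(figure))       # S14: representative points
        frame_numbers.append(number)
        curves = fit_curves(frame_numbers, point_rows)  # S15: curve approximation
    if not frame_numbers:
        return None
    return {"leading_frame": frame_numbers[0],          # S17: data to be recorded
            "trailing_frame": frame_numbers[-1],
            "curves": curves}


# Stub processes: the object is visible in frames 10 and 11 only.
frames = [(10, "r0"), (11, "r1"), (12, None), (13, "r3")]
result = generate_object_region_data(
    frames,
    extract_region=lambda frame: frame,
    approximate_figure=lambda region: region,
    extract_points=lambda figure: [(0, 0)],
    fit_curves=lambda numbers, rows: len(numbers),
)
print(result)
```

Calling fit_curves inside the loop mirrors the position of step S15 in the flowchart. Fitting once after the loop would correspond to the alternative in which the approximation is performed after the representative points of all frames have been obtained.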

[0095] 

The example has been described with which one figure is 
assigned to one object to roughly express the object region. 
The accuracy of approximation may be improved by making
approximation to the region of one object with a plurality 
of figures. FIG. 11 shows an example in which a plurality 
of figures are approximated to one object. In the foregoing 
case, a region of a person in the image is expressed with 6 
ellipses 600 to 605.
[0096]

When one object is expressed with the plural figures as 
shown in FIG. 11, a process for dividing the object into 
a plurality of regions must be performed. The process may 
be performed by an arbitrary method. For example, a method
with which the object is directly divided manually may
be employed. In the foregoing case, a pointing device, such
as a mouse, is used to enclose the region on the image
with a rectangle or an ellipse. Alternatively, the region
is specified with a trajectory of the pointing device. When
an automatic method is employed as a substitute for the
manual operation, a method may be employed with which
clustering of
movement of the object is performed to realize the division.
The foregoing method is a method with which the movement of
each region in the object among the successive frames is
determined by a correlation method (refer to, for example,
Image Analysis Handbook, Chapter-3, Section II, Publish
Conference of Tokyo University, 1991) or a method with
gradient constraints (refer to, for example, "Determining
optical flow", B. K. P. Horn and B. G. Schunck, Artificial
Intelligence, Vol. 17, pp. 185-203, 1981) to collect similar 
movements to form a region. 

[0097] 

Each of the divided regions is subjected to the process 
which is performed by the example of the structure shown in 
FIG. 1 or the procedure shown in FIG. 10 so that data of the
approximate figure is produced. In the foregoing case, the
number of spline functions which must be described in object
region data of one object increases as the number of the
approximate figures increases. Therefore, the structure of
data is formed which includes approximate figure data 405 by
the number (L in the foregoing case) corresponding to the 
approximate figure number 404, as shown in FIG. 12. 
[0098] 

As described above, the field for the approximate
figure number 404 may be omitted by making the approximate
figure number always be one (therefore, data of the
approximate figure is also always one) for the object
region data. In the foregoing case, one object can be
expressed with a plurality of figures when object region
data is produced for each figure approximating one object
(the same ID number is given).
[0099] 

When one object is expressed with a plurality of
figures in this embodiment, the same type of figure is employed.
Alternatively, a mixture of a plurality of types of figures
may be employed.
[0100] 

A variation of the method of using the region data produced
and recorded in this embodiment will now be described.
Although a person, an animal, a building or a plant is
usually considered as the object in a video, the process
according to this embodiment may be applied to any object in
the video.
For example, a telop may be handled as an object in a video.
Therefore, a process in which a telop is employed as
a variation of the object will now be described.
[0101] 

The telop is character information added to the image.
In the U.S., character information called a "closed caption"
must be added. In broadcasts in Japan, telops are used with
increasing frequency. The telops which are
displayed include a still telop and moving telops, such as
a telop which is scrolled upwards in the screen and a telop
which is scrolled from the right to the left of the screen.
When the region in which the telop is being displayed is 
approximated with a figure to store the telop character 
train as related information, the contents of the image can 
easily be recognized or a predetermined image can easily be 
retrieved. 

[0102] 

The region extracting portion 101 performs a process by
employing a method with which a telop region is manually
specified. Another method may be employed which has been
disclosed in "Method of Extracting Character Portion from
Video to Recognize Telop" (Hori, 99-CVIM-114, pp. 129-136,
1999, "Information Processing Society of Japan Technical
Report") and with which the brightness and edge information
of characters are employed to extract character trains.
Another method has been disclosed in
"Improvement in Accuracy of Newspaper Story Based on Telop
Character Recognition of News Video" (Katayama et al., Vol. 1,
pp. 105-110, Proceedings of Meeting on Image Recognition
and Understanding (MIRU '98)) to separate background and the
telop from each other by examining the intensity of edges.
Thus, the telop region is extracted. Each character and
each character train may be cut from the obtained telop
region. Edge information in the telop region in successive
frames is compared to detect a frame in
which the telop has appeared and a frame in which the same
has disappeared.
[0103] 

The region figure approximating portion 102 performs
a process to approximate the telop region extracted by the
region extracting portion 101 with a rectangle. The number
of the frame in which the telop has appeared is stored
in the leading frame number of object region data (401 shown
in FIG. 4 or FIG. 12). On the other hand, the number of the
frame in which the telop has disappeared is stored in the
trailing frame number 402. A pointer for pointing the
character train information of the telop is stored in the
related information pointer 403.
As approximate figure data 405, rectangular region
data encircling the telop is stored. When each row of
a telop composed of a plurality of rows is made to be
an individual region or when each character is made to be
an individual region, the number of rows or characters is
stored in the approximate figure number 404. Rectangular
region data encircling each row or character, that is,
approximate figure data 405, is stored by the corresponding
number.

[0104]

The figure-representative-point extracting portion 103
and the representative point trajectory curve approximating
portion 104 perform processes as described above because no
portion specialized for the telop is included in the
processes.

[0105] 

The character train information of the telop which has
appeared is stored in the related information storage
portion 105. Moreover, the pointer for pointing to that
information is stored in telop region data (object region
data).

[0106] 

When a keyword has been input and a character train
corresponding or relating to the keyword is included in the
character train information of the telop, the frame and time
at which the character train appears can easily be detected.
If the video is a news program, articles of interest can be
retrieved so that only those articles are viewed.
[0107] 

In the foregoing case, addition of a pointer for 
pointing object region data corresponding to the frame or 
time to the character train information of the telop 
facilitates the retrieval. 
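
The keyword retrieval described above can be sketched as follows. This is only a minimal illustration in Python and not the implementation of the embodiment: the names `TelopRegion` and `find_telops` are hypothetical, and a case-insensitive substring match is assumed in place of whatever matching rule is actually used.

```python
from dataclasses import dataclass


@dataclass
class TelopRegion:
    """Hypothetical stand-in for telop object region data."""
    leading_frame: int   # frame in which the telop appears (field 401)
    trailing_frame: int  # frame in which the telop disappears (field 402)
    text: str            # character train reached through the related information pointer


def find_telops(regions, keyword):
    """Return (leading_frame, trailing_frame) of every telop containing the keyword."""
    key = keyword.lower()
    return [(r.leading_frame, r.trailing_frame)
            for r in regions if key in r.text.lower()]


# Illustrative data only; the frame numbers and texts are made up.
regions = [
    TelopRegion(120, 300, "Weather: heavy rain expected"),
    TelopRegion(450, 700, "Election results announced"),
    TelopRegion(900, 1100, "Rain continues in the north"),
]
hits = find_telops(regions, "rain")
```

Each returned pair gives the frame interval to which reproduction can jump, which is the role played by the pointer from the character train information back to object region data.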

[0108] 

Thus, the telop is processed as described above.
Variations of the object may be applied to the method of
using this embodiment.
[0109] 

A method of providing video data and object region 
data will now be described. 
[0110] 

When object region data produced by the process
according to this embodiment is provided for a user,
a creator must provide the object region data for the user
by a method of some kind. The object region data may be
provided by any one of the following methods.

(1) A method with which video data, its object region
data and its related information are recorded in one (or
a plurality of) recording media so as to be provided
simultaneously.

(2) A method with which video data and object region
data are recorded in one (or a plurality of) recording
media so as to be provided simultaneously, whereas
related information is either provided individually or not
provided (in the latter case, related information can
individually be acquired through a network or the like).

(3) A method with which video data is provided alone,
whereas object region data and related information are
recorded in one (or a plurality of) recording media so as
to be provided simultaneously.

(4) A method with which video data, object region
data and related information are individually provided.

A recording medium is mainly used to perform
provision in the foregoing cases. Another method may be
employed with which a portion or the whole of the
data is provided through a communication medium.

[0111] 
(Second Embodiment) 

The first embodiment has the structure that the
representative points of a figure approximating the object
region in a video are extracted so as to be converted into
object region data. On the other hand, a second embodiment
has a structure that characteristic points in the object
region in the video are extracted so as to be converted into
object region data.

[0112] 

The following description focuses on the structures that
differ from those according to the first embodiment.
[0113] 

FIG. 13 shows an example of the structure of an
object-region-data converting apparatus according to this
embodiment. As shown in FIG. 13, the object-region-data
generating apparatus according to this embodiment
incorporates a video data storage portion 230,
a characteristic-point extracting portion 233,
a characteristic-point-curve approximating portion 234 for
approximating the arrangement of characteristic points with
a curve, a related information storage portion 235 and
a region data storage portion 236.
[0114]

Referring to FIG. 13, the video data storage portion 230
has the same function as that of the video data storage 
portion 100 according to the first embodiment. The related 
information storage portion 235 has the same function as 
that of the related information storage portion 105 
according to the first embodiment. The region data storage 
portion 236 has the same function as that of the region 
data storage portion 106 according to the first embodiment. 

[0115] 

The characteristic-point extracting portion 233
extracts at least one characteristic point from the object
region in the video. The characteristic point may be any
one of a variety of points. For example, corners of an object
(detected by, for example, a method disclosed in "Gray-level
corner detection", L. Kitchen and A. Rosenfeld, Pattern
Recognition Letters, No. 1, pp. 95-102, 1982) or the centroid
of the object may be employed. When the centroid of the
object is employed as the characteristic point, it is
preferable that the portion around the point assumed as the
centroid is specified and then automatic extraction is
performed.
[0116] 

The characteristic-point-curve approximating portion
234 has a basic function similar to that of the
representative point trajectory curve approximating portion
104 according to the first embodiment. That is, the
characteristic-point-curve approximating portion 234
time-sequentially approximates, to a curve, the positions of
the characteristic points extracted by the
characteristic-point extracting portion 233. The approximate
curve is, for each of the X and Y coordinates, expressed as
the function of the frame number f or the time stamp t given
to the video so as to be approximated with a curve by linear
approximation or approximation using a spline curve.
Data after the approximation has been performed is recorded
by a method similar to that according to the first
embodiment.
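
As a rough illustration of approximating one coordinate of a characteristic point as a function of the frame number, the following Python sketch selects knots of a piecewise-linear curve greedily. It is only one possible scheme under stated assumptions and is not taken from the embodiment: the function names `approximate` and `interpolate`, the greedy knot selection and the error tolerance are all assumptions introduced here, and at least two samples are assumed.

```python
def interpolate(knots, f):
    """Evaluate a piecewise-linear trajectory, given as sorted (frame, value)
    knots, at frame number f."""
    for (f0, v0), (f1, v1) in zip(knots, knots[1:]):
        if f0 <= f <= f1:
            return v0 + (v1 - v0) * (f - f0) / (f1 - f0)
    raise ValueError("frame outside the described interval")


def approximate(samples, tolerance):
    """Greedily choose knots so that every sample lies within `tolerance`
    of the straight line joining the neighbouring knots.

    `samples` is a list of (frame, value) pairs sorted by frame number,
    containing at least two entries.
    """
    knots = [samples[0]]
    start = 0
    for end in range(2, len(samples)):
        (f0, v0), (f1, v1) = samples[start], samples[end]
        fits = all(
            abs(v0 + (v1 - v0) * (f - f0) / (f1 - f0) - v) <= tolerance
            for f, v in samples[start + 1:end]
        )
        if not fits:
            # The line start -> end misses a sample, so close the segment
            # at the previous sample and start a new one there.
            knots.append(samples[end - 1])
            start = end - 1
    knots.append(samples[-1])
    return knots


# X coordinate of one characteristic point over frames 0..6 (made-up data).
x_samples = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 3), (5, 3), (6, 3)]
x_knots = approximate(x_samples, 0.1)
```

The Y coordinate would be handled in the same way, and only the knots need to be recorded. A spline curve could be substituted for the line segments without changing the overall idea.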
[0117] 

Note that object region data according to this
embodiment is basically similar to object region
data according to the first embodiment (see FIG. 5), except
that the field for the approximate figure number is not
required and that "data of the approximate figure" is
replaced by "data of characteristic points".
[0118] 

Data of the characteristic points in object region
data is also basically similar to data of the approximate
figure according to the first embodiment (see FIG. 6). Note
that the "number of representative points" is replaced by
the "number of characteristic points", and the "data of the
trajectory of representative points" by the "data of the
trajectory of characteristic points". Note also that the
figure type ID is not required.
[0119]

Data of the trajectory of the characteristic points
included in the data of the characteristic points is similar
to data of the trajectory of the representative points
according to the first embodiment (see FIG. 7).

[0120] 

FIG. 14 is a flowchart showing an example of a flow of
the process of the object-region-data converting apparatus
according to this embodiment. The overall flow is similar
to that according to the first embodiment. Steps S12 to S14
shown in FIG. 10 are replaced by a step of extracting the
characteristic points of the object of interest. The
representative point in step S15 shown in FIG. 10 is
replaced by the characteristic point.

[0121] 

As a matter of course, the process according to this 
embodiment can be realized by software. 
[0122] 

As described above, the structure according to this
embodiment is able to describe the object region in a video
as parameters of a curve approximating the time-sequential
trajectory (the trajectory of the coordinates of the
characteristic points having the frame numbers or time
stamps as the variables) of the characteristic points of the
region.

[0123] 

Since this embodiment enables the object region in
a video to be expressed with only the parameters of the
function, object region data, the quantity of which can be
reduced and which can easily be handled, can be generated.
Moreover, expression of the characteristic points and
generation of the parameters of the approximate curve can
easily be performed.
[0124] 

According to this embodiment, whether or not arbitrary
coordinates specified by a user indicate the object region
can considerably easily be determined. As a result,
specification of a moving object in a video
can be performed even more easily.
[0125] 

Note that object region data based on the 
representative points of the approximate figure of the 
object region according to the first embodiment and object 
region data based on the characteristic points of the object 
region according to the second embodiment may be mixed with 
each other. 

In the foregoing case, the format of object region
data according to the first embodiment is provided with
a flag for identifying whether object region data is
based on the representative points of the approximate figure
of the object region or on the characteristic points of the
object region. As an alternative to providing the flag in
the format of object region data according to the first
embodiment, a specific value of the figure type ID may
indicate that object region data is based on the
characteristic points of the object region. Any other
value then indicates that object region
data is based on the representative points of the
approximate figure of the object region.
[0126] 

The structure of object region data and the side that
creates it have been described. The side that uses the
above-mentioned object region data will now be described.

[0127] 
(Third Embodiment) 

In the third embodiment, when object region
data including related information has been given to the
object in the video, a user specifies an object (mainly on
a GUI screen) to cause related information to be presented
(display of characters, a still image or a moving image, or
output of sound) or to cause a related program to be executed.

[0128] 

FIG. 15 shows an example of the structure of 
a data processing apparatus according to this embodiment. 
As shown in FIG. 15, the data processing apparatus according 
to this embodiment incorporates a video data display portion 
301, a control unit 302, a related information display 
portion 303 and an instruction input portion 304. 

[0129] 

The video data display portion 301 displays video
data input from a recording medium or the like (not shown)
on a liquid crystal display unit or a CRT.
[0130] 

The instruction input portion 304 permits a user to use 
a pointing device, such as a mouse, or a keyboard to perform 
an operation, for example, specification of an object in the 
video displayed on the liquid crystal unit or the CRT. 
Moreover, the instruction input portion 304 receives input 
from the user. 

[0131] 

The control unit 302, to be described later, determines
whether or not the user has specified the object in the
video in accordance with, for example, the coordinates
specified by the user on the screen and object region
data input from a recording medium (not shown). Moreover,
the control unit 302 refers to the pointer for
pointing related information of object region data when
a determination has been made that the user has specified
a certain object in the video. Thus, the control unit 302
acquires related information of the object to display the
related information.

[0132] 

The related information display portion 303 responds to 
the instruction issued from the control unit 302 to acquire 
and display related information (from a recording medium or 
a server or the like through a network).

[0133] 

When the pointer for pointing related information is
a pointer for pointing data in which a program or an
operation of the computer is described, the computer
performs the predetermined operation.
[0134] 

As a matter of course, this embodiment may also be
realized by software.
[0135] 

A process which is performed when the object region is 
expressed as an approximate figure similarly to the first 
embodiment will now be described. 
[0136] 

FIG. 16 shows an example of the process according to 
this example. The flowchart shown in FIG. 16 includes only 
a process which is performed when a certain region in 
a video which is being displayed during reproduction of the 
video is specified by using a pointing device, such as
a mouse (basically corresponding to the process which
is performed by the control unit 302).

[0137] 

In step S31, the coordinates on the screen specified by 
using the pointing device or the like are calculated. 
Moreover, the frame number of the video which is being 
reproduced at the moment of the instruction is acquired. 
Note that a time stamp may be employed as a substitute for 
the frame number (hereinafter the frame number is employed).

[0138] 

In step S32, the object existing in the video at
the frame number at which the object has been specified is
selected from object region data of the objects added to the
video. The foregoing selection can easily be performed by
referring to the leading frame number and the
trailing frame number of object region data.
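
The selection in step S32 amounts to an interval test on the frame number. A minimal sketch in Python follows; the class `ObjectRegionData` is a hypothetical subset of the real fields, and treating both the leading and the trailing frame as included in the interval is an assumption made here.

```python
from dataclasses import dataclass


@dataclass
class ObjectRegionData:
    """Hypothetical subset of the fields of object region data."""
    object_id: int
    leading_frame: int
    trailing_frame: int


def objects_in_frame(all_data, frame):
    """Return the object region data whose frame interval contains `frame`."""
    return [d for d in all_data
            if d.leading_frame <= frame <= d.trailing_frame]


# Made-up object region data for three objects.
catalog = [
    ObjectRegionData(1, 0, 100),
    ObjectRegionData(2, 50, 200),
    ObjectRegionData(3, 150, 300),
]
visible = [d.object_id for d in objects_in_frame(catalog, 75)]
```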
[0139] 

In step S33, data of a spline function (see FIGS. 6 and
7) extracted from object region data of the object selected
in step S32 is used to calculate the coordinates of the
representative points of the approximate figure in the frame
which was displayed when the object was specified.
Thus, two focal points F and G and point H on the ellipse
are obtained in the example according to the first
embodiment (see FIGS. 2 and 4).

[0140] 

In step S34, it is determined whether or not the 
coordinates specified by using the pointing device or the 
like exist in the object (that is, the approximate figure) 
in accordance with the discrimination procedure which is 
decided according to the obtained representative points and 
the figure type ID of object region data. 

[0141] 

When the ellipse is represented by the two focal points 
and one point on the ellipse similarly to the first 
embodiment, the determination can easily be made. 

[0142] 

When, for example, the Euclidean distance between
point P and point Q is expressed by E(P, Q) similarly to
the first embodiment, the following inequality is held in
a case where the coordinate P specified by using the
pointing device exists in the ellipse:

$$E(F, P) + E(G, P) \le E(F, H) + E(G, H)$$

[0143]

In a case where the coordinate P exists on the outside
of the ellipse, the following inequality is held:

$$E(F, P) + E(G, P) > E(F, H) + E(G, H)$$

The foregoing inequalities are used to determine
whether or not the specified point exists in the object.
Then, it is determined whether step S35 is performed or
omitted (skipped) in accordance with a result of the
determination.
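
The ellipse test reduces to comparing two sums of distances. The following Python sketch illustrates it; the function name `inside_ellipse` and the sample coordinates are assumptions introduced for illustration, and points are assumed to be (x, y) tuples.

```python
import math


def inside_ellipse(f, g, h, p):
    """True if p lies in or on the ellipse with focal points f and g
    that passes through the point h."""
    return math.dist(f, p) + math.dist(g, p) <= math.dist(f, h) + math.dist(g, h)


# Made-up representative points: focal points F and G, and H on the ellipse.
F, G, H = (-3.0, 0.0), (3.0, 0.0), (0.0, 4.0)
clicked_inside = inside_ellipse(F, G, H, (1.0, 1.0))
clicked_outside = inside_ellipse(F, G, H, (0.0, 4.5))
```

No parameters other than the three representative points are needed, which is why this representation makes the determination simple.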

[0144] 

When a parallelogram is employed as the approximate 
figure of the object region in the video, four inequalities 
are used as a substitution for one inequality in the case of 
the ellipse to determine whether or not the arbitrary 
coordinates exist in the object. 

[0145] 

When, for example, points A, B and C shown in FIG. 8
are representative points, point D is obtained as follows:

$$D = C + A - B$$

Then, an assumption is made that a point on a straight
line passing through the points A and B is Q and the
straight line is expressed by the following equation:

$$f_{A,B}(Q) = 0$$

When the point P exists in the figure, the following
four inequalities are held:

$$f_{A,B}(P) \ge 0, \quad f_{C,D}(P) \le 0, \quad f_{B,C}(P) \ge 0, \quad f_{D,A}(P) \le 0$$

where the constant term of the equation $f_{A,B}(P) = 0$ is
larger than the constant term of the equation $f_{C,D}(P) = 0$, and
the constant term of the equation $f_{B,C}(P) = 0$ is larger than
the constant term of the equation $f_{D,A}(P) = 0$.

When the coordinate of point P is (x, y), the equation
$f_{A,B}(P) = 0$ is given as follows:

$$y - y_A - \frac{y_A - y_B}{x_A - x_B}\,(x - x_A) = 0$$
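
A point-in-parallelogram test along these lines can be sketched in Python as follows. Note that this sketch deliberately substitutes a cross-product form of the line functions for the slope form used in the text, so that no division is needed when an edge is vertical; the function names `side` and `inside_parallelogram` and the sample coordinates are assumptions introduced here.

```python
def side(u, v, p):
    """Signed test of p against the directed line u -> v: positive on the
    left, negative on the right, zero on the line."""
    return (v[0] - u[0]) * (p[1] - u[1]) - (v[1] - u[1]) * (p[0] - u[0])


def inside_parallelogram(a, b, c, p):
    """True if p lies in or on the parallelogram whose consecutive
    vertices are a, b, c (the fourth vertex is derived)."""
    d = (c[0] + a[0] - b[0], c[1] + a[1] - b[1])  # D = C + A - B
    signs = [side(a, b, p), side(b, c, p), side(c, d, p), side(d, a, p)]
    # Inside when p is on the same side of all four directed edges.
    return all(s >= 0 for s in signs) or all(s <= 0 for s in signs)


# Made-up representative points A, B and C; D works out to (1, 2).
A, B, C = (0, 0), (4, 0), (5, 2)
hit = inside_parallelogram(A, B, C, (2, 1))
miss = inside_parallelogram(A, B, C, (0, 1))
```

Accepting either all non-negative or all non-positive signs makes the test independent of whether the vertices are given clockwise or counter-clockwise.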

[0146] 

When approximation to one object with a plurality of 
approximate figures is made (refer to the approximate figure 
number shown in FIGS. 5 and 12), the foregoing process is 
performed for each approximate figure. 

[0147] 

Step S35 is a process which is performed only when the
specified point exists in the object region. In the
foregoing case, a reference to "the related information
pointer" contained in object region data is made. In
accordance with information about the pointer, related
information is acquired so as to be, for example, displayed
(in the example of the structure shown in FIG. 15, the
foregoing process is performed by the related information
display portion 303). When a program has been specified as
related information, the specified program is executed or
another specified operation is performed. When related
information itself has been described in object region data,
it is only necessary to display that related information.
[0148] 

FIG. 17 shows an example of a case where description 
(a text) of an object in a video has been given as the 
related information. When the coordinates specified by 
using the pointing device 802 during reproduction of a video 
800 exist in the object region 801 (a figure approximating 
the object 801), related information 803 is displayed. 

[0149] 

In step S36, a branching process is performed so that
it is determined whether or not another object having object
region data exists in the frame in which the
object has been specified. If such an object exists, the
process returns to step S32. If no such object exists,
the operation is completed.

[0150] 

A process which is performed when the object region is
expressed as characteristic points of the object similarly 
to the second embodiment will now be described. 
[0151] 

The portions different from those according to the
first embodiment will mainly be described.
[0152]

FIG. 18 shows an example of the procedure according to 
this example. Note that the flowchart shown in FIG. 18 
includes only a process (basically, corresponding to the 
process which is performed by the control unit 302) which is 
performed when a certain region in a video which is being 
displayed during reproduction of the video has been 
specified by using a pointing device, such as a mouse cursor. 
Since the overall flow is similar to that of the flowchart 
shown in FIG. 16, different portions will mainly be 
described (steps S41, S42, S45 and S46 are similar to steps 
S31, S32, S35 and S36).

[0153] 

In step S43, the coordinates of the position of the
characteristic point of an object in the displayed frame
are calculated from object region data. When
a plurality of characteristic points exist, the coordinates
of all of the characteristic points are calculated.

[0154] 

In step S44, the distance between the position of the
characteristic point calculated in step S43 and the
coordinates specified by clicking is calculated for all of
the characteristic points. Then, it is determined whether
or not one or more characteristic points lie at a distance
shorter than a predetermined threshold value.
Alternatively, a process for calculating the distance for
a certain characteristic point and comparing the distance
with the predetermined threshold value is repeated, and the
process is interrupted when one characteristic point lying
at a distance shorter than the threshold value is detected.
If one or more characteristic points lying at a distance
shorter than the threshold value exist, the
process proceeds to step S45. If no such characteristic
point exists, the process proceeds to
step S46.
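
The early-terminating variant of step S44 can be illustrated with a short Python sketch. The function name `near_characteristic_point`, the sample coordinates and the threshold value are assumptions introduced for illustration only.

```python
import math


def near_characteristic_point(points, clicked, threshold):
    """True as soon as one characteristic point is found closer to the
    clicked coordinates than `threshold`; remaining points are not examined."""
    return any(math.dist(pt, clicked) < threshold for pt in points)


# Made-up characteristic points of one object in the displayed frame.
points = [(10, 10), (50, 40)]
proceed_to_s45 = near_characteristic_point(points, (52, 41), 5)
skip_to_s46 = not near_characteristic_point(points, (30, 30), 5)
```

Because `any` stops at the first match, the loop is interrupted exactly as in the alternative procedure described above.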

[0155] 

As a result of the foregoing process, display of
related information can be performed in accordance with the
coordinates of the characteristic point of the object when
a portion adjacent to the region of interest has been
specified by an operation using a pointing device or the
like.

[0156] 
(Fourth Embodiment) 

A fourth embodiment will now be described with which 
an object region having related information which can be 
displayed is clearly displayed (communicated to a user) by 
using object region data. In the foregoing case, the object 
having related information which can be displayed must 
previously be supplied with object region data including 
a pointer for pointing the related information. 

[0157] 

The block structure of this embodiment is similar to
that according to, for example, the third embodiment (see
FIG. 15).

[0158] 

As a matter of course, this embodiment can also be
realized by software.
[0159] 

A case in which the object region has been expressed as
an approximate figure similarly to the first embodiment will
now be described.
[0160] 

FIG. 19 shows an example of a process according to this
embodiment.
[0161] 

An example case in which the approximate figure is
an ellipse will now be described. As a matter of course,
a similar process is performed in a case of another
approximate figure.
[0162] 

In step S51, the frame number of a video which is being 
displayed is acquired. Note that a time stamp may be 
employed as a substitute for the frame number (hereinafter 
the frame number is employed).

[0163] 

In step S52, an object existing in the video at the
frame number acquired in step S51 is selected. The
selection is performed by detecting object region data,
among the object region data given to the video, for which
the displayed frame number lies between the leading frame
number and the trailing frame number.
[0164]

In step S53, data of a spline function (see FIGS. 6 and 
7) is extracted from object region data of the object 
selected in step S52. Then, the coordinates of
representative points of an approximate figure (or a region 
having related information) in the displayed frame are 
calculated. 

[0165] 

In step S54, a reference to the figure type ID of 
object region data is made to obtain an approximate figure 
expressed by the representative points calculated in step 
S53. Then, display of the image in each approximate figure 
(for example, an ellipse region) is changed. 
[0166] 

The change may be performed by a variety of methods.
When the approximate figure is, for example, an ellipse, the
brightness of the image in the ellipse region is intensified
by a predetermined value. Assuming that the degree of
intensification is ΔY, the brightness before the change of
the display is Y and an upper limit of the brightness which
can be displayed is Ymax, each pixel in the ellipse is
displayed with brightness of MIN(Y + ΔY, Ymax). Pixels on
the outside of the ellipse are displayed with brightness of
Y. Note that MIN(a, b) is a function taking the smaller value
of a and b.
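
The brightness change can be sketched in Python as below. This is an illustration under stated assumptions rather than the implementation of the embodiment: the image is assumed to be a list of rows of brightness values, pixel coordinates are assumed to be (column, row), the upper limit Ymax is assumed to be 255, and the name `highlight` is hypothetical.

```python
import math


def highlight(luma, f, g, h, delta, y_max=255):
    """Return a copy of `luma` (rows of brightness values) in which every
    pixel inside the ellipse (focal points f and g, point h on the ellipse)
    is displayed with MIN(Y + delta, y_max)."""
    limit = math.dist(f, h) + math.dist(g, h)
    out = []
    for y, row in enumerate(luma):
        out.append([
            min(v + delta, y_max)
            if math.dist(f, (x, y)) + math.dist(g, (x, y)) <= limit else v
            for x, v in enumerate(row)
        ])
    return out


# A 5x5 frame of uniform brightness with one nearly saturated pixel.
frame = [[100] * 5 for _ in range(5)]
frame[2][2] = 250
# Degenerate ellipse (a circle of radius 1 centred on pixel (2, 2)).
result = highlight(frame, (2, 2), (2, 2), (2, 3), 20)
```

The input frame is left unmodified, so the unchanged image remains available when the display change is switched off.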

[0167] 

FIG. 20 shows an example with which the object region
is displayed by the method with which the brightness is
intensified (in FIG. 20, hatching indicates no change in the
brightness and no hatching indicates intensified brightness).
FIG. 20(a) shows a screen 1000 which is in a state in which
the display change process in step S54 has not been
performed. Reference numeral 1001 represents an object
having object region data in the video. A screen 1002 shown
in FIG. 20(b) is displayed after the change in the display
in step S54 has been performed. Reference numeral 1003
represents an ellipse region approximating the object region
1001. Display of only the inside portion of the ellipse
region 1003 is brightened. Thus, the user can recognize
that the object permits display or the like of related
information.
[0168] 

When one object is approximated with a plurality of 
approximate figures (refer to approximate figure number 
shown in FIGS. 5 and 12), the foregoing process is performed 
for each approximate figure. 

[0169] 

In step S55, it is determined whether or not another
object, the display of which must be changed, exists.
A determination is made whether or not a non-processed
object having a display frame number which is between the
leading frame number and the trailing frame number exists.
If the non-processed object exists, the process from step
S52 is repeated. If no object of the foregoing type exists,
the process is completed.
[0170] 

As described above, display of an object region having 
the related information among the regions of the object in 
the video which is specified by using object region data is 
changed. Thus, whether or not the related information 
exists can quickly be detected. 
[0171] 

A method of indicating the object region which permits
display or the like of related information is not limited to
the above-mentioned method with which the brightness in the
object region is changed. Any one of a variety of methods
may be employed. A variety of the methods will now be
described. The procedure of each process using object region
data is basically similar to the flowchart shown in FIG. 19,
with step S54 changed to a corresponding process.
[0172] 

A display method shown in FIG. 21 is a method of
displaying the position of an object having related
information on the outside of an image 1600. Reference
numerals 1601 and 1602 represent objects having related
information. Reference numerals 1603 and 1604 represent
bars for displaying the position of the object in the
direction of the axis of ordinate and in the direction of
the axis of abscissa. Display 1605 and display 1606
correspond to the object 1601 having related information.
FIG. 21 shows a structure in which bars serving as marks are
displayed in the regions onto which the region 1601 is
projected in the direction of the axis of ordinate and in
the direction of the axis of abscissa. Similarly, reference
numerals 1607 and 1608 represent bars for displaying the
object region 1602.
[0173] 

A state of projection of the object region in the 
foregoing directions can easily be obtained by using the 
coordinates of the representative points of the approximate 
figure in the frame obtained from data of the approximate 
figure of object region data and the figure type ID as 
described in the embodiments. 
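
For an ellipse given by two focal points and one point on the ellipse, the projections onto the two axes can be computed in closed form. The following Python sketch shows one way to do so; the function name `ellipse_projection` is hypothetical and the derivation in the comments is supplied here for illustration, not taken from the embodiments.

```python
import math


def ellipse_projection(f, g, h):
    """Return ((x_min, x_max), (y_min, y_max)), the intervals covered by the
    ellipse with focal points f and g passing through h when projected onto
    the horizontal and vertical axes."""
    a = (math.dist(f, h) + math.dist(g, h)) / 2   # semi-major axis
    c = math.dist(f, g) / 2                        # centre-to-focus distance
    b2 = a * a - c * c                             # squared semi-minor axis
    cx, cy = (f[0] + g[0]) / 2, (f[1] + g[1]) / 2  # centre of the ellipse
    # Half-extent along x is sqrt(a^2 cos^2(t) + b^2 sin^2(t)) for major-axis
    # angle t, which simplifies to sqrt(b^2 + (dx / 2)^2); likewise for y.
    half_w = math.sqrt(b2 + ((g[0] - f[0]) / 2) ** 2)
    half_h = math.sqrt(b2 + ((g[1] - f[1]) / 2) ** 2)
    return (cx - half_w, cx + half_w), (cy - half_h, cy + half_h)


# Made-up representative points of an axis-aligned ellipse.
x_bar, y_bar = ellipse_projection((-3.0, 0.0), (3.0, 0.0), (0.0, 4.0))
```

The two returned intervals correspond to the extents of the marks drawn on the horizontal and vertical bars.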
[0174] 

It is preferable that the region of a different object 
is indicated with a bar displayed in a different manner (for 
example, a different color).

[0175] 

The method according to this embodiment allows a user
to specify a position inside the image, by using a pointing
device, with reference to the bars 1603 and 1604 displayed
in the vertical and horizontal directions on the outside of
the image 1600. Thus, related information can be
displayed.

[0176] 

FIG. 22 shows another display method with which the
position of an object having related information is
displayed on the outside of an image 1700. Objects 1701 and
1702 each having related information exist in the image 1700.
The position of the object having related information is
indicated by object-position indicating bars 1703 and
1704. As distinct from the example shown in FIG. 21, each
display bar indicates only the position of the centroid of
the object as a substitute for the object region. Circles
1705 and 1706 indicate the centroid of the object 1701.
Circles 1707 and 1708 indicate the centroid of the object
1702.
[0177] 

Also the centroid of the object region in the foregoing 
directions can easily be obtained in accordance with the 
coordinates of the representative point of the approximate 
figure in the frame obtained from data of the approximate 
figure of object region data and the figure type ID. 

[0178] 

The foregoing method enables display which can easily
be recognized, because the size of display on the object
position indicating bar can be kept small even if the object
has a large size or many objects exist.

[0179] 

FIG. 23 shows an example of a display method with which
a related information list is displayed on the outside of
an image 1800. The image 1800 contains objects 1801 and
1802 each having related information. Reference numeral
1803 represents a list of objects each having related
information. The list 1803 shows information of objects
each having related information in the image frame which is
being displayed. In the example shown in FIG. 23, names of
objects are displayed which are obtained as a result of
retrieving related information from object region data of
the objects existing in the frame.
[0180] 

The foregoing method permits a user to cause related
information to be displayed by specifying the name shown in
the related information list 1803 as well as by specifying
the region 1801 or 1802 with the pointing device. Since
related information can also be displayed by specifying the
number shown in the list 1803, the foregoing
structure can be employed in a case of a remote control
having no pointing device.

[0181] 

FIG. 24 shows a display method with which objects 1901 
and 1902 existing in an image 1900 and each having related 
information are indicated with icons 1903 and 1904 to 
indicate existence of related information. The icon 1903 
corresponds to the object 1901, while the icon 1904 
corresponds to the object 1902. 

[0182] 

Each icon can be displayed by obtaining an approximate 
figure as described above, by cutting a rectangle region 
having a predetermined size including the obtained 
approximate figure from video data in the frame and by, for 
example, arbitrarily contracting the cut rectangle region. 
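
A minimal sketch of the icon generation is given below. It assumes that the frame is held as a list of pixel rows, that the cut rectangle is the bounding box of the approximate figure enlarged by a fixed margin, and that contraction is done by simple subsampling; the function name `make_icon` and the parameters `margin` and `step` are hypothetical.

```python
def make_icon(frame, figure_points, margin=2, step=2):
    """Cut a rectangle around the approximate figure and contract it.

    frame is a list of rows of pixel values; figure_points are (x, y)
    coordinates of the approximate figure. All names are illustrative.
    """
    height = len(frame)
    width = len(frame[0])
    xs = [x for x, _ in figure_points]
    ys = [y for _, y in figure_points]
    # Bounding rectangle plus margin, clipped to the frame boundaries.
    left = max(0, int(min(xs)) - margin)
    top = max(0, int(min(ys)) - margin)
    right = min(width, int(max(xs)) + margin + 1)
    bottom = min(height, int(max(ys)) + margin + 1)
    cropped = [row[left:right] for row in frame[top:bottom]]
    # Contract by keeping every step-th pixel in both directions.
    return [row[::step] for row in cropped[::step]]
```

Subsampling is used here only for brevity; any contraction method may be substituted.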



[0183] 

The foregoing method enables related information to be 
displayed by directly specifying the icon as well as 
specifying the object region in the video. 

[0184] 

FIG. 25 shows an example of a display method configured 
to display a map indicating the object region having related 
information so as to indicate existence of related 
information. An image 2000 includes objects 2001 and 2002 
each having related information. Reference numeral 2003 
represents a map of the regions of the objects each having 
related information. The map 2003 indicates the positions 
of the regions of the objects each having related 
information in the image 2000. Reference numeral 2004 
represents the object 2001, while reference numeral 2005 
represents the object 2002. 
[0185] 

The map 2003 has a form obtained by reducing the image 
2000 and arranged to display only the images of the object 
regions (only the approximate figures obtained as described 
above are displayed at the corresponding positions in the 
contracted image). 

[0186] 

It is preferred that a region indicating bar has 
different display forms with respect to the different 
objects. 



[0187] 

The foregoing method enables related information to be 
displayed by specifying the object region displayed on the 
map 2003 as well as direct specification of an object in the 
image 2000. 

[0188] 

FIG. 26 shows an example of the display method with 
which specification of an object existing in the image and 
having related information by using a pointing device is 
facilitated by controlling the reproduction rate of the 
image according to the position of the mouse cursor. Reference 
numerals 2100 and 2102 represent the overall bodies of the 
display screens and reference numerals 2101 and 2103 
represent regions on the display screens on which images are 
being displayed. In the display screen 2100 shown in 
FIG. 26(a), a mouse cursor 2104 is positioned on the outside 
of the image 2101 so that the image is reproduced at 
a normal display rate. In the display screen 2102 shown in 
FIG. 26(b), the mouse cursor 2105 exists in the image region 
2103. Therefore, the display rate of the image is lowered or 
the displayed image is frozen. 

[0189] 

Another structure may be employed as a substitute for 
the above-mentioned structure in which image display rate is 
always lowered or the displayed image is frozen when the 
mouse cursor has entered the image region. That is, whether 
or not an object having related information exists in the 
frame is determined (determination is made by comparing the 
frame number and the leading frame number/trailing frame 
number with each other). If the object having related 
information exists in the frame, the image display rate is 
lowered or the displayed image is frozen. 
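
The decision described above can be sketched as follows. The function name `playback_rate`, the rectangle layout `(left, top, width, height)`, and the rate values are assumptions for illustration; a `slow_rate` of 0.0 stands for freezing the displayed image.

```python
def playback_rate(cursor, image_rect, frame_number, object_ranges,
                  normal_rate=1.0, slow_rate=0.25):
    """Choose the display rate from the cursor position and frame contents.

    object_ranges is a list of (leading_frame, trailing_frame) pairs, one per
    object having related information. All names are illustrative.
    """
    x, y = cursor
    left, top, width, height = image_rect
    inside = left <= x < left + width and top <= y < top + height
    if not inside:
        return normal_rate
    # An object exists in the frame when the frame number lies between its
    # leading frame number and trailing frame number.
    has_object = any(lead <= frame_number <= trail
                     for lead, trail in object_ranges)
    return slow_rate if has_object else normal_rate
```

With this arrangement the rate is lowered only while the cursor is inside the image region and at least one object having related information is present in the frame being displayed.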
[0190] 

For example, an object which is moving at high speed in 
the video cannot sometimes easily be specified by using the 
mouse cursor. The foregoing method is arranged to change 
the reproducing speed according to the position of the mouse 
cursor. Thus, movement of the object can be slowed when the 
object is specified or the displayed image can be frozen. 
Hence it follows that instruction can easily be performed. 

[0191] 

FIG. 27 shows an example of the display method with 
which an object existing in the image and having related 
information can easily be specified by using the pointing 
device. Reference numeral 2500 represents an image which is 
being reproduced. Reference numeral 2501 represents 
a button for acquiring an image. When the button 2501 is 
depressed with a mouse pointer 2502, an image which has been 
displayed at the specified time can be acquired so as to be 
displayed on an acquired-image display portion 2503. The 
image 2500 is continuously reproduced even after the 
foregoing instruction has been performed with the button 
2501. Since the acquired image is displayed on the 
acquired-image display portion 2503 for a while, instruction 
of an object which is being displayed in the acquired-image 
display portion 2503 enables related information of the 
specified object to be displayed. 
[0192] 

The button 2501 for acquiring an image may be omitted. 
A structure may be employed from which the button 2501 is 
omitted and with which an image can automatically be 
acquired when the mouse cursor 2502 enters the video display 
portion 2500. 

[0193] 

A structure may be employed with which whether or not 
an object having related information exists in the frame is 
determined when the button 2501 has been depressed or the 
mouse cursor has entered the image region (for example, 
a determination is made by comparing the frame number and 
the leading frame number/trailing frame number with each 
other). Only when the object having related information 
exists in the frame, the image is acquired so as to be 
displayed. 

[0194] 

The foregoing method enables related information to 
easily be specified from a still image. 
[0195] 

The foregoing variations may be employed. Another 
method may be employed with which the region of an image 
which permits display or the like of related information is 
clearly displayed. Also a method may be employed with which 
instruction is facilitated. Thus, a variety of methods for 
aiding the operation of the user may be employed. 
[0196] 

A case in which the object region is expressed as 
characteristic points of the object similarly to the second 
embodiment will now be described. 
[0197] 

Portions different from those according to the first 
embodiment will mainly be described. 
[0198] 

In the foregoing case, the flowchart is basically similar 
to that shown in FIG. 19 except that characteristic points 
are employed as a substitute for 
the representative points. Specifically, the coordinates of 
characteristic points of the approximate figure are 
calculated in step S53. 
[0199] 

FIG. 20 shows the structure that the brightness in the 
approximate figure 1003 corresponding to the object 1001 is 
intensified. If three or more characteristic points exist 
in the foregoing case, a polygon having the vertices which 
are the characteristic points may be formed. Moreover, the 
brightness of the inside portion of the polygon may be 
intensified. If two or more characteristic points exist, 
a figure of some kind may be formed which has the 
representative points which are the characteristic points. 
Moreover, the brightness in the figure may be intensified. 
Alternatively, a figure, such as a circle, the center of 
which is each of the characteristic points and which has 
a somewhat large size is formed. Moreover, each of the 
formed figures to be displayed is made conspicuous 
by means of brightness, color or blinking. 
[0200] 

The structure shown in FIG. 21 is arranged such that 
projection of the approximate figures corresponding to the 
objects 1601 and 1602 in the vertical and horizontal 
directions is displayed as the bar set 1605 and 1607 or the 
bar set 1606 and 1608. If three or more characteristic 
points exist in the foregoing case, a polygon having the 
vertices which are the characteristic points may be formed. 
Moreover, projection of the polygon in the directions of the 
two axes may be displayed as the bars. If two or more 
characteristic points exist, a rectangle having the vertices 
which are the characteristic points may be formed. Moreover, 
projection into the directions of the two axes may be 
displayed as the bars. If one characteristic point exists, 
the foregoing method shown in FIG. 22 may be employed with 
which the position of the centroid is displayed with circles 
in the bars. 
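
The projection onto the two axes can be sketched as below. The function name `projection_bars` is hypothetical, and the sketch assumes that the figure is given by its vertices (the characteristic points), so that each projection is the interval between the smallest and largest coordinate.

```python
def projection_bars(points):
    """Project a figure given by its vertices onto the two axes.

    Returns ((x_min, x_max), (y_min, y_max)): the extent to draw on the
    horizontal bar and on the vertical bar. The name is illustrative.
    """
    if not points:
        raise ValueError("at least one characteristic point is needed")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), max(xs)), (min(ys), max(ys))
```

When only one characteristic point is supplied, both intervals collapse to a single position, which corresponds to marking a single point in each bar.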

[0201] 

FIG. 24 shows the structure with which the image of 
an object is extracted by cutting in accordance with the 
approximate figure or the like so as to be displayed as 
an icon. Also in the foregoing case, the image of an object 
can be extracted by cutting in accordance with the 
characteristic points so as to be displayed as an icon. 
[0202] 

FIG. 25 shows a structure in which the approximate figures 
1903 and 1904 are displayed in a map. Also in the foregoing 
case, a figure of some kind formed in accordance with 
characteristic points as described above may be displayed as 
a map. 

[0203] 

The methods shown in FIGS. 23, 26 and 27 may be employed 
in the foregoing case. 
[0204] 

The foregoing variations may be employed. Another 
method may be employed with which the region of an image 
which permits display or the like of related information is 
clearly displayed. Also a method may be employed with which 
instruction is facilitated. Thus, a variety of methods for 
aiding the operation of the user may be employed. 

[0205] 

Each of the foregoing structures may be realized by 
a recording medium storing a program for causing a computer 
to execute a predetermined means (or causing the computer to 
act as a predetermined means or causing the computer to 
realize a predetermined function). 
[0206] 

The present invention in its broader aspects is not 
limited to the embodiments described herein. Accordingly, 
various modifications may be made without departing from the 
spirit or scope of the general inventive concept. 
[0207] 

[Advantages of the invention] 

The present invention is configured such that the 
object region in a video is described as the parameter of 
a function approximating the trajectory obtained by 
arranging positional data of representative points of the 
approximate figure of the object region or the 
characteristic points of the object region in a direction in 
which frames proceed. Therefore, the region of 
a predetermined object can be described with a small 
quantity of data. Moreover, creation and handling of 
data can easily be performed. 
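
As a rough illustration of this data reduction, the following Python sketch approximates the trajectory of one representative point with a function of the frame number and keeps only the function parameters. A first-degree polynomial fitted by least squares is used purely for brevity and is an assumption of this sketch; the function names and the dictionary layout are likewise hypothetical.

```python
def fit_linear(frames, values):
    """Least-squares fit of values ~ slope * frame + intercept."""
    n = len(frames)
    if n < 2:
        raise ValueError("at least two samples are needed")
    mean_f = sum(frames) / n
    mean_v = sum(values) / n
    spread = sum((f - mean_f) ** 2 for f in frames)
    if spread == 0:
        raise ValueError("frame numbers must not all be equal")
    slope = sum((f - mean_f) * (v - mean_v)
                for f, v in zip(frames, values)) / spread
    return slope, mean_v - slope * mean_f


def describe_trajectory(frames, positions):
    """Describe the trajectory of one representative point by parameters."""
    xs = [x for x, _ in positions]
    ys = [y for _, y in positions]
    return {
        "leading_frame": frames[0],
        "trailing_frame": frames[-1],
        "x": fit_linear(frames, xs),  # (slope, intercept) for x
        "y": fit_linear(frames, ys),  # (slope, intercept) for y
    }


def position_at(description, frame):
    """Evaluate the approximated position at a given frame number."""
    ax, bx = description["x"]
    ay, by = description["y"]
    return (ax * frame + bx, ay * frame + by)
```

In this sketch the positions over all frames of the interval are replaced by four parameters and the two frame numbers, regardless of how many frames the interval contains.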

[0208] 

According to the present invention, a user is able to 
easily instruct an object in a video and determine the 
object. 

[Brief Description of the drawings] 
[FIG. 1] 

FIG. 1 is a diagram showing an example of the structure 
of an object-region-data generating apparatus according to 
a first embodiment of the present invention. 

[FIG. 2] 

FIG. 2 is a set of diagrams showing a procedure for describing 
an object region in a video with object region data. 



[FIG. 3] 

FIG. 3 is a diagram showing an example of a process for 
approximating an object region with an ellipse. 
[FIG. 4] 

FIG. 4 is a diagram showing an example of a process for 
detecting a representative point of an approximate ellipse 
of an object region. 

[FIG. 5] 

FIG. 5 is a diagram showing an example of the structure 
of object region data. 
[FIG. 6] 

FIG. 6 is a diagram showing an example of the structure 
of data of an approximate figure in object region data. 
[FIG. 7] 

FIG. 7 is a diagram showing an example of the structure 
of data of a trajectory of a representative point in data of 
an approximate figure. 

[FIG. 8] 

FIG. 8 is a diagram showing an example of 
representative points when the approximate figure is 
a parallelogram. 

[FIG. 9] 

FIG. 9 is a diagram showing an example of 
representative points when the approximate figure is 
a polygon. 

[FIG. 10] 

FIG. 10 is a flowchart showing an example of 
a procedure according to the first embodiment of the present 
invention. 

[FIG. 11] 

FIG. 11 is a diagram showing an example in which the 
object region in a video is expressed with a plurality of 
ellipses. 

[FIG. 12] 

FIG. 12 is a diagram showing an example of the 
structure of object region data including data of 
a plurality of approximate figures. 

[FIG. 13] 

FIG. 13 is a diagram showing an example of an object- 
region-data generating apparatus according to a second 
embodiment of the present invention. 

[FIG. 14] 

FIG. 14 is a flowchart showing an example of 
a procedure according to the second embodiment. 
[FIG. 15] 

FIG. 15 is a diagram showing an example of the 
structure of a video processing apparatus according to 
a third embodiment of the present invention. 

[FIG. 16] 

FIG. 16 is a flowchart showing an example of 
a procedure according to the third embodiment. 
[FIG. 17] 

FIG. 17 is a diagram showing an example of display of 
contents hyper media which uses object region data. 
[FIG. 18] 

FIG. 18 is a flowchart showing another example of the 
procedure according to the third embodiment. 
[FIG. 19] 

FIG. 19 is a flowchart showing an example of 
a procedure according to a fourth embodiment of the present 
invention. 

[FIG. 20] 

FIG. 20 is a set of diagrams showing an example of change in the 
display of an object region having related information. 
[FIG. 21] 

FIG. 21 is a diagram showing another example of the 
display of the position of an object region having related 
information. 

[FIG. 22] 

FIG. 22 is a diagram showing another example of the 
display of the position of an object region having related 
information. 

[FIG. 23] 

FIG. 23 is a diagram showing an example of display of 
a description list of an object region having related 
information. 

[FIG. 24] 

FIG. 24 is a diagram showing an example of display of 
an object region having related information with an icon. 
[FIG. 25] 

FIG. 25 is a diagram of an example of display of 
an object region having related information with a map. 
[FIG. 26] 

FIG. 26 is a set of diagrams showing an example of control of 
an image reproducing rate for facilitating instruction of 
an object region. 

[FIG. 27] 

FIG. 27 is a diagram showing an example which enables 
image capture for facilitating instruction of an object 
region. 

[Explanation of Reference numerals] 



100, 230 ... Video data storage portion, 
101 ... Region extracting portion, 
102 ... Region figure approximating portion, 
103 ... Figure-representative-point extracting portion, 
104 ... Representative point trajectory curve approximating portion, 
105, 235 ... Related information storage portion, 
106, 236 ... Region data storage portion, 
233 ... Characteristic-point extracting portion, 
234 ... Characteristic-point-curve approximating portion, 
301 ... Video data display portion, 
302 ... Control unit, 
303 ... Related information display portion, 
304 ... Instruction input portion. 

[Document] ABSTRACT 
[Abstract] 

[Object] It is an object of the present invention to provide 
a method of describing object region data which is capable of 
describing a desired object region in a video by using 
a small quantity of data and facilitating generation of 
data and handling of the same. 

[Means for Achieving the Object] A method of describing 
object region data such that information about an arbitrary 
object region in a video is described over a plurality of 
continuous frames, the method comprising: identifying 
a desired object region 201 in a video according to at least 
either of a figure 202 approximated to the object region or 
a characteristic point of the object region; approximating 
a trajectory obtained by arranging positions of 
representative points 203 of the approximate figure 202 or 
the characteristic points of the object region in a direction 
in which frames 200 proceed with a predetermined function 
204; and describing information about the object region by 
using the parameter 205 of the function. 
[Elected Figure] FIG. 2 



